In this script, we will conduct four sensitivity analyses:
Because we are preparing the sensitivity analysis, the first part will be the same, but using only the melanoma subgroup and adding the LDH variable.
Empty R environment
rm(list = ls())
Install packages
install.packages("dplyr")
install.packages("tidyr")
install.packages("table1")
install.packages("gtsummary")
install.packages("ggplot2")
install.packages("rms")
install.packages("survival")
install.packages("cmprsk")
install.packages("tidycmprsk")
install.packages("survminer")
install.packages("ggsurvfit")
install.packages("ggpubr")
Activate packages
library(dplyr)
library(tidyr)
library(table1)
library(gtsummary)
library(ggplot2)
library(rms)
library(survival)
library(cmprsk)
library(tidycmprsk)
library(survminer)
library(ggsurvfit)
library(ggpubr)
Make function to make binned Residual vs Fitted plot.
residual_lp_binned <- function(data,
fit,
title = expression(paste("Binned dev. residuals against ",hat(eta))),
n_breaks = max(c(10, min(c(floor(nrow(data)/10), 50))))
){
if(!is.null(fit$na.action)){
data <- data[-c(fit$na.action),]
}
data <- dplyr::mutate(data, residuals=residuals(fit, type = "pearson"),linpred=predict(fit, type = "link"))
gdf <- dplyr::group_by(data, cut(linpred, breaks=unique(quantile(linpred, (1:(n_breaks))/(n_breaks+1)))))
diagdf <- dplyr::summarise(gdf, residuals=mean(residuals),linpred=mean(linpred))
data$residuals <- data$linpred <- NULL
ggplot2::ggplot(diagdf, aes(y=residuals, x=linpred)
)+ ggplot2::geom_hline(yintercept = 0
,color = "darkgrey"
)+ ggplot2::geom_point(
)+ ggplot2::labs(x=expression(hat(eta))
,y="pearson residuals (binned)"
,title=title
,caption = deparse1(fit$formula)
)+ ggplot2::theme_bw(
)
}
We use the table1 package to make a table of patient
characteristics. See the vignette
of table1 for more details.
The functions below define how data are displayed within the table.
#function to exclude missings form percentages
my.render.cat <- function(x) {
c("", sapply(stats.default(x), function(y) with(y,
if(PCTnoNA%in%c("NaN")){sprintf("")} #if all categories within one variable of a stratum are missing, no output
else{sprintf("%d (%0.1f%%)", FREQ, PCTnoNA)} #otherwise percentages excluding missings
)))
}
#function to show percentage of missings if applicable
my.render.missing <- function(x, ...) {
with(stats.apply.rounding(stats.default(is.na(x), ...), ...)$Yes,
if(PCT==100){c(missing = sprintf(""))} #if variable is missing for all subjects within this stratum, don't show missings
else( c(missing=sprintf("%s (%s%%)", FREQ, PCT))) #otherwise, show percentages of missings
)
}
#function to show continuous variables as median [Q1, Q3]
my.render.cont <- function(x, name, ...) {
with(stats.default(x),
c("", "median [Q1-Q3]"=sprintf("%s [%s-%s]"
, round_pad(MEDIAN, digits=1)
, round_pad(Q1, digits=1)
, round_pad(Q3, digits=1)
)
,"mean (SD)"=sprintf("%s (%s)"
, round_pad(MEAN, digits=1)
, round_pad(SD, digits=1))
))
}
Define working directory and load files
Load the data after preprocessing to obtain physical activity variables from SQUASH questionnaire and clinical variables.
We have two data frames:
data_OS: data from all SQUASH respondents with survival
data. We will use this for assessing the correlation of physical
activity with survival, however melanoma patients only.data_tox: data from all SQUASH respondents with at
least one year follow-up. We will use this for assessing the correlation
of physical activity with severe irAEs, but only using the melanoma
patients.We will make some additional data frames for the subgroup analyses.
OS_melanoma: for the sensitivity analysis of OS
restricted to melanoma patients, we first use the subset melanoma
patients and include the variable LDH in our analyses.OS_palliative: for the sensitivity analysis of OS
restricted to irresectable patients.OS_PS: for the sensitivity analysis of OS restricted to
patient with ECOG performance status 0 or 1 (so excluding >1 and
unknown).tox_cICI: for the sensitivity analysis of SirAEs
restricted to patients who received ipilimumab+nivolumab.data_OS for the Fine and Gray
subdistribution hazards model, since this also includes patients who
started ICI less than one year before last follow-up.We double check if we have missing values and if we can incorporate them or we need to exclude them from the sensitivity analysis.
data_OS$OS_status <- as.numeric(as.character(recode(data_OS$OS_status, "no"="0", "yes"="1")))
Make variable with ECOG PS 0, 1 or >1. We will use this variable in sensitivity analyses.
data_OS$B_PS_tri <- factor(ifelse(data_OS$B_PS%in%"0","0",
ifelse(data_OS$B_PS%in%"1","1",
ifelse(data_OS$B_PS%in%c("2","3","4"),"2-3",
NA))))
table(data_OS$B_PS_tri, data_OS$B_PS, useNA="always")
##
## 0 1 2 3 4 unknown <NA>
## 0 126 0 0 0 0 0 0
## 1 0 101 0 0 0 0 0
## 2-3 0 0 16 2 0 0 0
## <NA> 0 0 0 0 0 6 0
data_tox$B_PS_tri <- factor(ifelse(data_tox$B_PS%in%"0","0",
ifelse(data_tox$B_PS%in%"1","1",
ifelse(data_tox$B_PS%in%c("2","3","4"),"2-3",
NA))))
table(data_tox$B_PS_tri, data_tox$B_PS, useNA="always")
##
## 0 1 2 3 4 unknown <NA>
## 0 103 0 0 0 0 0 0
## 1 0 86 0 0 0 0 0
## 2-3 0 0 13 1 0 0 0
## <NA> 0 0 0 0 0 6 0
Restricted cubic splines work with a certain number of knots: a hinge
point for the distribution which gets smoothed out by a cubic parameter.
We can specify the number and locations of knots or we can choose to let
the knot locations be chosen by our software. Usually, 3 to 5 knots
suffices. Consider that each knot ‘costs’ us a degree of freedom and
that too many knots may likely result in overfitting. Therefore, we will
use 3 knots here. We make use of the rcspline.eval()
function in the Hmisc package.
We will use the same knot locations in all models and sensitivity analyses based on data of all included patients.
Specify knots for MET-hours.
k.pos.totmet_hpwk <- rcspline.eval(data_OS$totmet_hpwk #variable to use splines for
, nk=3 #number of knots
, inclx=FALSE
, knots.only=T #only store knots
, type="ordinary"
, norm=2
, rpm=NULL
, pc=FALSE
, fractied=0.05)
Specify knots for MVPA-SL.
k.pos.MZvt_HWK <- rcspline.eval(data_OS$MZvt_HWK #variable to use splines for
, nk=3 #number of knots
, inclx=FALSE
, knots.only=T #only store knots
, type="ordinary"
, norm=2
, rpm=NULL
, pc=FALSE
, fractied=0.05)
Here, we will rerun the same analyses as primary analyses, but now with additional adjustment for ECOG performance status. ECOG performance status was missing for 6 patients (2.4%). Electronic patient files were checked, data seemed missing completely at random (MCAR). Given the low number of missing values, missingness under MCAR assumption and the analysis being a sensitivity analysis, a complete case analysis was considered appropriate.
To get some feeling, we calculate the proportion of patients with severe irAEs among each tertile of total PA.
tbl_summary(data_tox[, c("totmet_hpwk_tertiles", "T1_gr3_1y")]
, by="totmet_hpwk_tertiles"
, missing= "always"
, percent = "column")
| Characteristic | [0,51], N = 731 | (51,101], N = 691 | (101,371], N = 671 |
|---|---|---|---|
| T1_gr3_1y | 21 (29%) | 10 (14%) | 7 (10%) |
| Unknown | 0 | 0 | 0 |
| 1 n (%) | |||
Adjusted logistic regression for tertiles variable.
glm_tert.adj <- glm(formula = T1_gr3_1y ~ totmet_hpwk_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, family = binomial(link = "logit"), data = data_tox)
glm_tert.adj_PS <- glm(formula = T1_gr3_1y ~ totmet_hpwk_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, family = binomial(link = "logit"), data = data_tox)
tbl_merge(list(
tbl_regression(glm_tert.adj, exponentiate=T),
tbl_regression(glm_tert.adj_PS, exponentiate=T)
),
tab_spanner = c("without ECOG PS", "with ECOG PS")
)
| Characteristic | without ECOG PS | with ECOG PS | ||||
|---|---|---|---|---|---|---|
| OR1 | 95% CI1 | p-value | OR1 | 95% CI1 | p-value | |
| totmet_hpwk_tertiles | ||||||
| [0,51] | — | — | — | — | ||
| (51,101] | 0.34 | 0.12, 0.90 | 0.035 | 0.32 | 0.10, 0.88 | 0.033 |
| (101,371] | 0.19 | 0.05, 0.55 | 0.004 | 0.20 | 0.06, 0.60 | 0.006 |
| B_sex | ||||||
| male | — | — | — | — | ||
| female | 1.45 | 0.60, 3.48 | 0.4 | 1.61 | 0.64, 4.05 | 0.3 |
| B_age | 1.04 | 1.00, 1.08 | 0.062 | 1.04 | 1.00, 1.08 | 0.081 |
| B_tumor_type.1 | ||||||
| melanoma | — | — | — | — | ||
| NSCLC | 1.28 | 0.16, 9.48 | 0.8 | 1.64 | 0.19, 13.0 | 0.6 |
| RCC | 0.36 | 0.09, 1.27 | 0.12 | 0.34 | 0.08, 1.27 | 0.12 |
| other | 0.85 | 0.14, 4.70 | 0.9 | 1.15 | 0.18, 7.16 | 0.9 |
| B_paladj | ||||||
| palliative | — | — | — | — | ||
| adjuvant | 0.77 | 0.19, 3.35 | 0.7 | 1.09 | 0.23, 5.69 | >0.9 |
| B_ther_prev | ||||||
| no | — | — | — | — | ||
| yes | 0.24 | 0.03, 1.11 | 0.10 | 0.23 | 0.03, 1.12 | 0.10 |
| B_ther_cur_regrouped.1 | ||||||
| anti-PD-(L)1 | — | — | — | — | ||
| cICI | 14.5 | 3.43, 72.7 | <0.001 | 19.8 | 4.29, 115 | <0.001 |
| ICI + chemo/targeted therapy | 1.17 | 0.15, 8.55 | 0.9 | 1.26 | 0.15, 9.63 | 0.8 |
| B_PS_tri | ||||||
| 0 | — | — | ||||
| 1 | 0.98 | 0.36, 2.65 | >0.9 | |||
| 2-3 | 1.04 | 0.16, 6.32 | >0.9 | |||
| 1 OR = Odds Ratio, CI = Confidence Interval | ||||||
Adjusted logistic regression for continuous linear variable.
glm_lin.adj <- glm(formula = T1_gr3_1y ~ totmet_hpwk + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, family = binomial(link = "logit"), data = data_tox)
# tbl_regression(glm_lin.adj, exponentiate=T)
Assess linearity of totmet_hpwk and B_age
with outcome.
fit <- glm(formula = T1_gr3_1y ~ totmet_hpwk , family = binomial(link = "logit"), data = data_tox)
residual_lp_binned(data_tox, fit = fit, title = "totmet_hpwk")
fit <- glm(formula = T1_gr3_1y ~ B_age , family = binomial(link = "logit"), data = data_tox)
residual_lp_binned(data_tox, fit = fit, title = "age")
Assess whether residuals are normally distributed.
plot(glm_lin.adj, which = 2)
Check homoscedasticity.
residual_lp_binned(data_tox, glm_lin.adj)
residual_lp_binned(data_tox, glm_tert.adj)
Conduct survival analysis using splines regression by the
prespecified knots. We make use of the rcs() function in
package rms which provides restricted cubic splines.
glm_rcs.adj <- glm(formula = T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, family = binomial(link = "logit"), data = data_tox)
tbl_regression(glm_rcs.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.97 | 0.95, 0.99 | 0.005 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.03 | 1.00, 1.05 | 0.023 |
| B_sex | |||
| male | — | — | |
| female | 1.52 | 0.63, 3.63 | 0.3 |
| B_age | 1.03 | 1.00, 1.07 | 0.088 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.15 | 0.15, 8.15 | 0.9 |
| RCC | 0.36 | 0.09, 1.27 | 0.12 |
| other | 0.76 | 0.12, 4.25 | 0.8 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.70 | 0.17, 3.03 | 0.6 |
| B_ther_prev | |||
| no | — | — | |
| yes | 0.28 | 0.04, 1.27 | 0.14 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 11.5 | 2.79, 55.6 | 0.001 |
| ICI + chemo/targeted therapy | 1.17 | 0.15, 8.10 | 0.9 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex +
## B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: T1_gr3_1y ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev +
## B_ther_cur_regrouped.1
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 197 148.79
## 2 199 157.69 -2 -8.905 0.01165 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)+totmet_hpwk), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex +
## B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: T1_gr3_1y ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev +
## B_ther_cur_regrouped.1 + totmet_hpwk
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 197 148.79
## 2 198 154.24 -1 -5.4503 0.01956 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We want to visualise the Odds ratio (OR) for developing a severe irAE
within 1 year over the trajectory of our variable of interest
(totmet_hpwk). To do this, we will make a dummy dataset to
predict the OR for certain values of totmet_hpwk. We will
also obtain the 95% confidence interval (CI).
totmet_hpwk.
totmet_hpwk.if(!is.null(glm_rcs.adj$na.action)){
newdf <- data_tox[-c(glm_rcs.adj$na.action),]
} else {
newdf <- data_tox
}
newdf$totmet_hpwk <- c(seq(min(data_tox$totmet_hpwk),max(data_tox$totmet_hpwk),length=(nrow(newdf)-length(k.pos.totmet_hpwk))), k.pos.totmet_hpwk)
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(glm_rcs.adj, data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(glm_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(glm_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain OR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
Now, we can finally make the actual plot.
geom_ribbon() to obtain the 95%CI.geom_hline() to draw a line at the reference
value 1.geom_line() to draw the line for the HR.geom_point() to draw the location of the
splines.geom_rug() to visualise the instances of
var1 on the x-axis.scale_y_log10() to transform the y-axis by a 10
base logarithm and specify the breaks.p_splines_tox.OR_MET <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = data_tox, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.01,seq(0,1,.2)/10,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.01,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Odds ratio for severe irAE" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_tox.OR_MET
Alternatively, we can visualize the predicted probabilities given a patient in whom only the variable of interest changes.
newdf <- with(data_tox,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = round(median(B_age), digits = 0)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_PS_tri = levels(B_PS_tri)[1]
))
newdf$prob <- predict(glm_rcs.adj, newdata = newdf, type = "response")
p_splines_tox.prob <- ggplot(data = newdf
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_hline(yintercept=0 #add horizontal line at 0
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=prob) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=newdf[newdf$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=prob)
)+ geom_rug(data = data_tox, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.0,1.0) #set y-axis
)+ labs(x = "total PA (MET-hours/week)" #add x-axis label
,y ="Probability of severe irAE" #add y-axis label
,title = "probability of SirAE for a fictitious patient with differing total PA"
,caption = "64 year old male with ECOG performance status 0 and melanoma treated with first line anti-PD-(L)1 in palliative setting"
)+ theme_bw( #add/alter theme
)
p_splines_tox.prob
Adjusted logistic regression for tertiles variable.
glm_tert.adj <- glm(formula = T1_gr3_1y ~ MZvt_HWK_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, family = binomial(link = "logit"), data = data_tox)
glm_tert.adj_PS <- glm(formula = T1_gr3_1y ~ MZvt_HWK_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, family = binomial(link = "logit"), data = data_tox)
tbl_merge(list(
tbl_regression(glm_tert.adj, exponentiate=T),
tbl_regression(glm_tert.adj_PS, exponentiate=T)
),
tab_spanner = c("without ECOG PS", "with ECOG PS")
)
| Characteristic | without ECOG PS | with ECOG PS | ||||
|---|---|---|---|---|---|---|
| OR1 | 95% CI1 | p-value | OR1 | 95% CI1 | p-value | |
| MZvt_HWK_tertiles | ||||||
| [0,1.5] | — | — | — | — | ||
| (1.5,6.5] | 0.42 | 0.16, 1.06 | 0.073 | 0.40 | 0.14, 1.03 | 0.064 |
| (6.5,38.5] | 0.11 | 0.03, 0.38 | 0.001 | 0.12 | 0.03, 0.42 | 0.002 |
| B_sex | ||||||
| male | — | — | — | — | ||
| female | 1.11 | 0.45, 2.68 | 0.8 | 1.29 | 0.51, 3.25 | 0.6 |
| B_age | 1.05 | 1.01, 1.10 | 0.009 | 1.05 | 1.01, 1.10 | 0.017 |
| B_tumor_type.1 | ||||||
| melanoma | — | — | — | — | ||
| NSCLC | 1.08 | 0.12, 8.28 | >0.9 | 1.40 | 0.15, 11.5 | 0.8 |
| RCC | 0.30 | 0.07, 1.09 | 0.074 | 0.28 | 0.07, 1.07 | 0.070 |
| other | 0.80 | 0.13, 4.59 | 0.8 | 1.07 | 0.16, 6.88 | >0.9 |
| B_paladj | ||||||
| palliative | — | — | — | — | ||
| adjuvant | 0.85 | 0.20, 3.83 | 0.8 | 1.25 | 0.26, 6.96 | 0.8 |
| B_ther_prev | ||||||
| no | — | — | — | — | ||
| yes | 0.30 | 0.04, 1.42 | 0.2 | 0.29 | 0.04, 1.42 | 0.2 |
| B_ther_cur_regrouped.1 | ||||||
| anti-PD-(L)1 | — | — | — | — | ||
| cICI | 15.3 | 3.53, 80.4 | <0.001 | 21.0 | 4.42, 129 | <0.001 |
| ICI + chemo/targeted therapy | 1.45 | 0.18, 11.3 | 0.7 | 1.60 | 0.18, 13.0 | 0.7 |
| B_PS_tri | ||||||
| 0 | — | — | ||||
| 1 | 0.96 | 0.35, 2.61 | >0.9 | |||
| 2-3 | 1.20 | 0.19, 7.09 | 0.8 | |||
| 1 OR = Odds Ratio, CI = Confidence Interval | ||||||
Adjusted logistic regression for continuous linear variable.
glm_lin.adj <- glm(formula = T1_gr3_1y ~ MZvt_HWK + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, family = binomial(link = "logit"), data = data_tox)
# tbl_regression(glm_lin.adj, exponentiate=T)
Check assumptions:
Assess linearity of MZvt_HWK and B_age with
outcome.
fit <- glm(formula = T1_gr3_1y ~ MZvt_HWK , family = binomial(link = "logit"), data = data_tox)
residual_lp_binned(data_tox, fit = fit, title = "MZvt_HWK")
fit <- glm(formula = T1_gr3_1y ~ B_age , family = binomial(link = "logit"), data = data_tox)
residual_lp_binned(data_tox, fit = fit, title = "age")
Assess whether residuals are normally distributed.
plot(glm_lin.adj, which = 2)
Check homoscedasticity.
residual_lp_binned(data_tox, glm_lin.adj)
residual_lp_binned(data_tox, glm_tert.adj)
Conduct survival analysis using splines regression by the
prespecified knots. We make use of the rcs() function in
package rms which provides restricted cubic splines.
glm_rcs.adj <- glm(formula = T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, family = binomial(link = "logit"), data = data_tox)
tbl_regression(glm_rcs.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.80 | 0.62, 1.02 | 0.073 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.21 | 0.72, 1.93 | 0.4 |
| B_sex | |||
| male | — | — | |
| female | 1.23 | 0.49, 3.08 | 0.7 |
| B_age | 1.05 | 1.01, 1.10 | 0.019 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.35 | 0.15, 11.0 | 0.8 |
| RCC | 0.27 | 0.06, 1.02 | 0.060 |
| other | 1.04 | 0.16, 6.59 | >0.9 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 1.16 | 0.24, 6.37 | 0.9 |
| B_ther_prev | |||
| no | — | — | |
| yes | 0.27 | 0.03, 1.33 | 0.15 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 19.0 | 4.06, 114 | <0.001 |
| ICI + chemo/targeted therapy | 1.47 | 0.17, 11.8 | 0.7 |
| B_PS_tri | |||
| 0 | — | — | |
| 1 | 0.96 | 0.35, 2.61 | >0.9 |
| 2-3 | 1.11 | 0.17, 6.65 | >0.9 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age +
## B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 +
## B_PS_tri
## Model 2: T1_gr3_1y ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev +
## B_ther_cur_regrouped.1 + B_PS_tri
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 189 139.07
## 2 191 149.91 -2 -10.84 0.004427 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age +
## B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 +
## B_PS_tri
## Model 2: T1_gr3_1y ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev +
## B_ther_cur_regrouped.1 + B_PS_tri + MZvt_HWK
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 189 139.07
## 2 190 139.62 -1 -0.54822 0.459
if(!is.null(glm_rcs.adj$na.action)){
newdf <- data_tox[-c(glm_rcs.adj$na.action),]
} else {
newdf <- data_tox
}
newdf$MZvt_HWK <- c(seq(min(data_tox$MZvt_HWK),max(data_tox$MZvt_HWK),length=(nrow(newdf)-length(k.pos.MZvt_HWK))), k.pos.MZvt_HWK)
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(glm_rcs.adj, data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(glm_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(glm_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain OR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_tox.OR_MVPASL <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = data_tox, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.01,seq(0,1,.2)/10,seq(0,1,.2), seq(0,1,.2)*10))
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.01,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Odds ratio for severe irAE" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_tox.OR_MVPASL
Alternatively, we can visualize the predicted probabilities given a patient in whom only the variable of interest changes.
newdf <- with(data_tox,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=300), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = round(median(B_age), digits = 0)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_PS_tri = levels(B_PS_tri)[1]
))
newdf$prob <- predict(glm_rcs.adj, newdata = newdf, type = "response")
p_splines_tox.prob <- ggplot(data = newdf
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_hline(yintercept=0 #add horizontal line at 0
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=prob) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=newdf[newdf$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=prob)
)+ geom_rug(data = data_tox, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.0,1.0) #set y-axis
)+ labs(x = "MVPA-SL (hours/week)" #add x-axis label
,y ="Probability of severe irAE" #add y-axis label
,title = "probability of SirAE for a fictitious patient with differing MVPA-SL"
,caption = "64 year old male with ECOG performance status 0 and melanoma treated with first line anti-PD-(L)1 in palliative setting"
)+ theme_bw( #add/alter theme
)
p_splines_tox.prob
Make Supplementary Figure 1.
suppl_figure1 <- ggarrange(p_splines_tox.OR_MET, p_splines_tox.OR_MVPASL
,labels = c("A","B")
,ncol=2, nrow=1
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure1
Save as PDF.
Make tox_cICI.
tox_cICI <- data_tox[data_tox$B_ther_cur_regrouped.1%in%"cICI",]
tox_cICI$B_tumor_type.1 <- droplevels(tox_cICI$B_tumor_type.1)
table(tox_cICI$B_tumor_type.1, useNA = "always")
##
## melanoma RCC other <NA>
## 25 23 1 0
Notably, only one patient in our cohort developed his/her first severe irAE after one year. This patient is thus regarded as without severe irAE <1 year.
table(tox_cICI$T1_gr3_1y, useNA = "always")
##
## no yes <NA>
## 27 22 0
Specify data frame to use
data <- tox_cICI
Relabel variables and add units that you want to use in your table (all formatting to let the table look nicely):
label(data$B_sex) <- "sex"
label(data$B_age) <- "age"
units(data$B_age) <- "years"
label(data$B_PS) <- "ECOG performance status"
label(data$B_tumor_type.1) <- "tumor type"
label(data$B_tumor_stage_dich) <- "stage"
label(data$B_paladj) <- "treatment intent"
label(data$B_ther_prev) <- "previous systemic therapy"
label(data$B_ther_cur_regrouped.1) <- "therapy"
label(data$T1_gr3_1y) <- "severe irAE within 1 year"
label(data$totmet_hpwk_tertiles) <- "total PA"
label(data$totmet_hpwk) <- "total PA"
units(data$totmet_hpwk) <- "hours/week"
label(data$MZvt_HWK) <- "MVPA-SL"
units(data$MZvt_HWK) <- "hours/week"
Make the actual table:
~ B_sex + B_age + ... + B_ther_cur_regrouped.1 defines
the rows;| totmet_hpwk_tertiles defines the columns;table1(~ totmet_hpwk + MZvt_HWK + B_sex + B_age + B_PS + B_tumor_type.1 + B_tumor_stage_dich + B_paladj + B_ther_prev + B_ther_cur_regrouped.1| totmet_hpwk_tertiles,
data = data,
overall = "Overall",
# caption = "Table 1: baseline characteristics of combined ipilimumab",
# footnote = "RCC: renal cell carcinoma",
topclass = "Rtable1-zebra",
render.categorical = my.render.cat,
render.continuous = my.render.cont,#c(.="median [Q1, Q3]"),
render.missing = my.render.missing,
droplevels = TRUE)
| [0,51] (N=16) |
(51,101] (N=16) |
(101,371] (N=17) |
Overall (N=49) |
|
|---|---|---|---|---|
| total PA (hours/week) | ||||
| median [Q1-Q3] | 24.6 [8.3-35.5] | 76.1 [64.5-88.3] | 126.3 [119.4-178.7] | 77.8 [35.6-119.4] |
| mean (SD) | 22.7 (16.1) | 76.1 (15.6) | 156.0 (67.7) | 86.4 (69.2) |
| MVPA-SL (hours/week) | ||||
| median [Q1-Q3] | 0.0 [0.0-1.9] | 2.2 [0.5-5.3] | 7.0 [4.5-15.0] | 3.0 [0.0-7.0] |
| mean (SD) | 1.0 (1.7) | 3.7 (4.2) | 10.2 (7.3) | 5.1 (6.3) |
| sex | ||||
| male | 11 (68.8%) | 11 (68.8%) | 12 (70.6%) | 34 (69.4%) |
| female | 5 (31.2%) | 5 (31.2%) | 5 (29.4%) | 15 (30.6%) |
| age (years) | ||||
| median [Q1-Q3] | 65.5 [56.8-72.3] | 62.5 [53.0-74.3] | 60.0 [57.0-67.0] | 62.0 [55.0-72.0] |
| mean (SD) | 61.4 (15.7) | 61.1 (13.9) | 62.4 (8.9) | 61.6 (12.8) |
| ECOG performance status | ||||
| 0 | 8 (50.0%) | 5 (31.2%) | 9 (52.9%) | 22 (44.9%) |
| 1 | 4 (25.0%) | 11 (68.8%) | 6 (35.3%) | 21 (42.9%) |
| 2 | 3 (18.8%) | 0 (0.0%) | 1 (5.9%) | 4 (8.2%) |
| 3 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| 4 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| unknown | 1 (6.2%) | 0 (0.0%) | 1 (5.9%) | 2 (4.1%) |
| tumor type | ||||
| melanoma | 8 (50.0%) | 8 (50.0%) | 9 (52.9%) | 25 (51.0%) |
| RCC | 7 (43.8%) | 8 (50.0%) | 8 (47.1%) | 23 (46.9%) |
| other | 1 (6.2%) | 0 (0.0%) | 0 (0.0%) | 1 (2.0%) |
| stage | ||||
| III | 0 (0.0%) | 0 (0.0%) | 1 (5.9%) | 1 (2.0%) |
| IV | 16 (100.0%) | 16 (100.0%) | 16 (94.1%) | 48 (98.0%) |
| other | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| treatment intent | ||||
| palliative | 16 (100.0%) | 16 (100.0%) | 17 (100.0%) | 49 (100.0%) |
| adjuvant | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| previous systemic therapy | ||||
| no | 15 (93.8%) | 15 (93.8%) | 16 (94.1%) | 46 (93.9%) |
| yes | 1 (6.2%) | 1 (6.2%) | 1 (5.9%) | 3 (6.1%) |
| therapy | ||||
| anti-PD-(L)1 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| cICI | 16 (100.0%) | 16 (100.0%) | 17 (100.0%) | 49 (100.0%) |
| ICI + chemo/targeted therapy | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
Indeed we selected only patients who received cICI
Check which type of tumor the “other” is.
tox_cICI[tox_cICI$B_tumor_type.1%in%"other",c("record_id","B_tumor_type","B_tumor_type_other")]
Adjusted logistic regression for tertiles variable.
glm_tert.adj <- glm(formula = T1_gr3_1y ~ totmet_hpwk_tertiles + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_tert.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk_tertiles | |||
| [0,51] | — | — | |
| (51,101] | 0.46 | 0.10, 1.89 | 0.3 |
| (101,371] | 0.54 | 0.13, 2.14 | 0.4 |
| B_sex | |||
| male | — | — | |
| female | 1.67 | 0.48, 5.97 | 0.4 |
| B_age | 1.01 | 0.97, 1.06 | 0.6 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
Adjusted logistic regression for continuous linear variable.
glm_lin.adj <- glm(formula = T1_gr3_1y ~ totmet_hpwk + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_lin.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | 1.00 | 0.99, 1.01 | >0.9 |
| B_sex | |||
| male | — | — | |
| female | 1.64 | 0.48, 5.81 | 0.4 |
| B_age | 1.01 | 0.97, 1.06 | 0.6 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
Check assumptions: - Observations are independent: - is the case by design. - Relationships of explanatory variables with outcome should be linear: - Residual vs Fitted(=predictor) plot per (numeric) covariate. - Should show a horizontal band without curvature/pattern. - Since our outcome is binary, we need to bin the residuals. - Residuals should be normally distributed: - QQ-plot: points should lie on one diagonal line. - Homoscedasticity=homogeneity of variances: - The spread of the residuals does not depend on the linear predictor. - Residual vs Fitted(=linear predictor) plot. - Should show a horizontal band without curvature/pattern.
Assess linearity of totmet_hpwk and B_age
with outcome.
fit <- glm(formula = T1_gr3_1y ~ totmet_hpwk , family = binomial(link = "logit"), data = tox_cICI)
residual_lp_binned(tox_cICI, fit = fit, title = "totmet_hpwk")
fit <- glm(formula = T1_gr3_1y ~ B_age , family = binomial(link = "logit"), data = tox_cICI)
residual_lp_binned(tox_cICI, fit = fit, title = "age")
Assess whether residuals are normally distributed.
plot(glm_lin.adj, which = 2)
Check homoscedasticity.
residual_lp_binned(tox_cICI, glm_lin.adj)
residual_lp_binned(tox_cICI, glm_tert.adj)
Initial regression.
glm_rcs.adj <- glm(formula = T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_rcs.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.98 | 0.96, 1.01 | 0.2 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.02 | 1.00, 1.06 | 0.13 |
| B_sex | |||
| male | — | — | |
| female | 1.61 | 0.45, 5.90 | 0.5 |
| B_age | 1.01 | 0.96, 1.06 | 0.7 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex +
## B_age
## Model 2: T1_gr3_1y ~ B_sex + B_age
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 44 63.846
## 2 46 66.469 -2 -2.6225 0.2695
Than, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)+totmet_hpwk), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex +
## B_age
## Model 2: T1_gr3_1y ~ B_sex + B_age + totmet_hpwk
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 44 63.846
## 2 45 66.455 -1 -2.6088 0.1063
Make dummy database.
newdf <- tox_cICI
newdf$totmet_hpwk <- c(seq(min(tox_cICI$totmet_hpwk),max(tox_cICI$totmet_hpwk),length=(nrow(newdf)-length(k.pos.totmet_hpwk))), k.pos.totmet_hpwk)
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(glm_rcs.adj, data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(glm_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(glm_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain OR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
p_splines_tox.OR_MET_cICI <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = tox_cICI, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.01,seq(0,1,.2)/10,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.01,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Odds ratio for severe irAE" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_tox.OR_MET_cICI
Alternatively, we can visualize the predicted probabilities given a patient in whom only the variable of interest changes.
newdf <- with(tox_cICI,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = round(median(B_age), digits = 0)
))
newdf$prob <- predict(glm_rcs.adj, newdata = newdf, type = "response")
p_splines_tox.prob_MET_cICI <- ggplot(data = newdf
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_hline(yintercept=0 #add horizontal line at 0
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=prob) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=newdf[newdf$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=prob)
)+ geom_rug(data = tox_cICI, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.0,1.0) #set y-axis
)+ labs(x = "total PA (MET-hours/week)" #add x-axis label
,y ="Probability of severe irAE" #add y-axis label
,title = "probability of SirAE for a fictitious patient with differing total PA"
,caption = "62 year old male patient treated with cICI"
)+ theme_bw( #add/alter theme
)
p_splines_tox.prob_MET_cICI
Adjusted logistic regression for tertiles variable.
glm_tert.adj <- glm(formula = T1_gr3_1y ~ MZvt_HWK_tertiles + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_tert.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK_tertiles | |||
| [0,1.5] | — | — | |
| (1.5,6.5] | 0.58 | 0.14, 2.26 | 0.4 |
| (6.5,38.5] | 0.31 | 0.06, 1.41 | 0.14 |
| B_sex | |||
| male | — | — | |
| female | 1.34 | 0.37, 4.92 | 0.7 |
| B_age | 1.02 | 0.97, 1.07 | 0.4 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
Adjusted logistic regression for continuous linear variable.
glm_lin.adj <- glm(formula = T1_gr3_1y ~ MZvt_HWK + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_lin.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | 0.95 | 0.85, 1.05 | 0.4 |
| B_sex | |||
| male | — | — | |
| female | 1.48 | 0.42, 5.32 | 0.5 |
| B_age | 1.02 | 0.97, 1.07 | 0.5 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
###SirAEs~MVPA-SL logistic regression - assumptions
Assess linearity of MZvt_HWK and B_age with
outcome.
fit <- glm(formula = T1_gr3_1y ~ MZvt_HWK , family = binomial(link = "logit"), data = tox_cICI)
residual_lp_binned(tox_cICI, fit = fit, title = "MZvt_HWK")
fit <- glm(formula = T1_gr3_1y ~ B_age , family = binomial(link = "logit"), data = tox_cICI)
residual_lp_binned(tox_cICI, fit = fit, title = "age")
Assess whether residuals are normally distributed.
plot(glm_lin.adj, which = 2)
Check homoscedasticity.
residual_lp_binned(tox_cICI, glm_lin.adj)
residual_lp_binned(tox_cICI, glm_tert.adj)
Initial regression.
glm_rcs.adj <- glm(formula = T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age,
family = binomial(link = "logit"),
data = tox_cICI)
tbl_regression(glm_rcs.adj, exponentiate=T)
| Characteristic | OR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.92 | 0.67, 1.28 | 0.6 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.06 | 0.57, 1.89 | 0.8 |
| B_sex | |||
| male | — | — | |
| female | 1.46 | 0.41, 5.28 | 0.6 |
| B_age | 1.02 | 0.97, 1.07 | 0.5 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age
## Model 2: T1_gr3_1y ~ B_sex + B_age
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 44 65.557
## 2 46 66.469 -2 -0.91148 0.634
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(glm_rcs.adj,update(glm_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK), test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: T1_gr3_1y ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age
## Model 2: T1_gr3_1y ~ B_sex + B_age + MZvt_HWK
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 44 65.557
## 2 45 65.598 -1 -0.04129 0.839
Make dummy dataset
newdf <- tox_cICI
newdf$MZvt_HWK <- c(seq(min(tox_cICI$MZvt_HWK),max(tox_cICI$MZvt_HWK),length=(nrow(newdf)-length(k.pos.MZvt_HWK))), k.pos.MZvt_HWK)
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(glm_rcs.adj, data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(glm_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(glm_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain OR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_tox.OR_MVPASL_cICI <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = tox_cICI, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.01,seq(0,1,.2)/10,seq(0,1,.2), seq(0,1,.2)*10))
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.01,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Odds ratio for severe irAE" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_tox.OR_MVPASL_cICI
Alternatively, we can visualize the predicted probabilities given a patient in whom only the variable of interest changes.
newdf <- with(tox_cICI,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=300), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = round(median(B_age), digits = 0)
))
median(tox_cICI$B_age)
## [1] 62
newdf$prob <- predict(glm_rcs.adj, newdata = newdf, type = "response")
p_splines_tox.prob_MVPASL_cICI <- ggplot(data = newdf
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_hline(yintercept=0 #add horizontal line at 0
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=prob) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=newdf[newdf$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=prob)
)+ geom_rug(data = tox_cICI, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.0,1.0) #set y-axis
)+ labs(x = "MVPA-SL (hours/week)" #add x-axis label
,y ="Probability of severe irAE" #add y-axis label
,title = "probability of SirAE for a fictitious patient with differing MVPA-SL"
,caption = "62 year old male patient treated with cICI"
)+ theme_bw( #add/alter theme
)
p_splines_tox.prob_MVPASL_cICI
Make Supplementary Figure 2.
suppl_figure2 <- ggarrange(p_splines_tox.OR_MET_cICI, p_splines_tox.OR_MVPASL_cICI
,labels = c("A","B")
,ncol=2, nrow=1
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure2
Save as PDF.
Death might be a competing risk for irAE occurrence. In other words: patients who die early might not have survived long enough to have an irAE although they would have had one if they would have lived longer. The proportional subdistribution hazard model as described by Fine and Gray (FG), infers the rate of occurrence of the event (irAE) in patients who have not yet experienced that specific event, including patients who have experienced the competing event (death). While the Fine and Gray model provides us a Hazard Ratio (HR), we can obtain a graphical representation by plotting the cumulative incidence function.
We will use the data frame data_OS: data from all SQUASH
respondents with survival data.
For competing risk analyses, a variable is needed reflecting all
possible outcomes: - censor =censored; - SirAE
=developed severe irAE; - death =died before developing a
severe irAE (in our case all deaths were caused by disease
progression).
Estimate the cumulative incidence in the context of competing risks
using the cuminc() function from the
tidycmprsk package, which is a wrapper around the
cmprsk package by Gray. We can now obtain the cumulative
incidence at a specific time point. To visualize the cumulative
incidence function, we will use the ggcumic() function from
the ggsurvfit package.
cif_tert.crude_MET <- tidycmprsk::cuminc(Surv(T1_gr3_days, T1_gr3_status) ~ totmet_hpwk_tertiles, data = data_OS)
tidycmprsk::tbl_cuminc(cif_tert.crude_MET,
times = 12*30.4375,
outcomes = c("SirAE" , "death"),
label_header = "1-year cuminc"
) #%>%
| Characteristic | 1-year cuminc |
|---|---|
| SirAE | |
| totmet_hpwk_tertiles | |
| [0,51] | 30% (20%, 40%) |
| (51,101] | 13% (7.1%, 22%) |
| (101,371] | 13% (7.0%, 21%) |
| death | |
| totmet_hpwk_tertiles | |
| [0,51] | 28% (19%, 39%) |
| (51,101] | 12% (5.5%, 21%) |
| (101,371] | 11% (4.9%, 19%) |
# add_p() #to add p-value
cuminc_MET <- ggcuminc(cif_tert.crude_MET,
outcome = c("death", "SirAE")
,theme = list(theme_ggsurvfit_default(),theme(legend.position = "right"))
# ) + add_confidence_interval(
) + add_risktable(risktable_stats = "n.risk"
) + labs(x = "Time (months)"
,color = "total PA"
,linetype = "outcome"
) + coord_cartesian(ylim = c(0,0.5)
)
cuminc_MET
For the subdistribution hazard regression, we will use the
crr() function from the tidycmprsk
package.
crr_tert.adj_MET <- tidycmprsk::crr(Surv(T1_gr3_days, T1_gr3_status) ~ totmet_hpwk_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
failcode = "SirAE",
data = data_OS)
tbl_regression(crr_tert.adj_MET, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk_tertiles | |||
| [0,51] | — | — | |
| (51,101] | 0.47 | 0.21, 1.05 | 0.066 |
| (101,371] | 0.43 | 0.21, 0.90 | 0.024 |
| B_sex | |||
| male | — | — | |
| female | 1.30 | 0.65, 2.61 | 0.5 |
| B_age | 1.01 | 0.98, 1.04 | 0.4 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.34 | 0.28, 6.40 | 0.7 |
| RCC | 0.53 | 0.21, 1.32 | 0.2 |
| other | 0.78 | 0.19, 3.27 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.60 | 0.15, 2.36 | 0.5 |
| B_ther_prev | |||
| no | — | — | |
| yes | 0.35 | 0.11, 1.14 | 0.081 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 8.03 | 2.74, 23.5 | <0.001 |
| ICI + chemo/targeted therapy | 1.63 | 0.50, 5.32 | 0.4 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Estimate the cumulative incidence in the context of competing risks
using the cuminc() function from the
tidycmprsk package, which is a wrapper around the
cmprsk package by Gray. We can now obtain the cumulative
incidence at a specific time point. To visualize the cumulative
incidence function, we will use the ggcumic() function from
the ggsurvfit package.
cif_tert.crude_MVPASL <- tidycmprsk::cuminc(Surv(T1_gr3_days, T1_gr3_status) ~ MZvt_HWK_tertiles, data = data_OS)
tidycmprsk::tbl_cuminc(cif_tert.crude_MVPASL,
times = 12*30.4375,
outcomes = c("SirAE" , "death"),
label_header = "1-year cuminc"
) #%>%
| Characteristic | 1-year cuminc |
|---|---|
| SirAE | |
| MZvt_HWK_tertiles | |
| [0,1.5] | 29% (20%, 39%) |
| (1.5,6.5] | 15% (8.1%, 24%) |
| (6.5,38.5] | 11% (5.6%, 20%) |
| death | |
| MZvt_HWK_tertiles | |
| [0,1.5] | 27% (17%, 37%) |
| (1.5,6.5] | 14% (6.9%, 23%) |
| (6.5,38.5] | 9.9% (4.3%, 18%) |
# add_p() #to add p-value
cuminc_MVPASL <- ggcuminc(cif_tert.crude_MVPASL,
outcome = c("death", "SirAE")
,theme = list(theme_ggsurvfit_default(),theme(legend.position = "right"))
# ) + add_confidence_interval(
) + add_risktable(risktable_stats = "n.risk"
) + labs(x = "Time (months)"
,color = "MVPA-SL"
,linetype = "outcome"
) + coord_cartesian(ylim = c(0,0.5)
)
cuminc_MVPASL
For the subdistribution hazard regression, we will use the
crr() function from the tidycmprsk
package.
crr_tert.adj_MVPASL <- tidycmprsk::crr(Surv(T1_gr3_days, T1_gr3_status) ~ MZvt_HWK_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
failcode = "SirAE",
data = data_OS)
tbl_regression(crr_tert.adj_MVPASL, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK_tertiles | |||
| [0,1.5] | — | — | |
| (1.5,6.5] | 0.60 | 0.29, 1.27 | 0.2 |
| (6.5,38.5] | 0.38 | 0.17, 0.88 | 0.025 |
| B_sex | |||
| male | — | — | |
| female | 1.13 | 0.58, 2.22 | 0.7 |
| B_age | 1.02 | 0.99, 1.05 | 0.2 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.33 | 0.25, 7.19 | 0.7 |
| RCC | 0.54 | 0.22, 1.37 | 0.2 |
| other | 0.78 | 0.18, 3.43 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.60 | 0.14, 2.46 | 0.5 |
| B_ther_prev | |||
| no | — | — | |
| yes | 0.38 | 0.12, 1.25 | 0.11 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 7.69 | 2.52, 23.5 | <0.001 |
| ICI + chemo/targeted therapy | 1.71 | 0.48, 6.08 | 0.4 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Make Supplementary Figure 3.
suppl_figure3 <- ggarrange(cuminc_MET, cuminc_MVPASL
,labels = c("A","B")
,ncol=2, nrow=1
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure3
Save as PDF.
Make OS_pal.
OS_pal <- data_OS[data_OS$B_paladj%in%"palliative",]
Specify data frame to use
data <- OS_pal
Relabel variables and add units that you want to use in your table (all formatting to let the table look nicely):
label(data$B_sex) <- "sex"
label(data$B_age) <- "age"
units(data$B_age) <- "years"
label(data$B_PS) <- "ECOG performance status"
label(data$B_tumor_type.1) <- "tumor type"
label(data$B_tumor_stage_dich) <- "stage"
label(data$B_paladj) <- "treatment intent"
label(data$B_ther_prev) <- "previous systemic therapy"
label(data$B_ther_cur_regrouped.1) <- "therapy"
label(data$T1_gr3_1y) <- "severe irAE within 1 year"
label(data$totmet_hpwk_tertiles) <- "total PA"
label(data$totmet_hpwk) <- "total PA"
units(data$totmet_hpwk) <- "hours/week"
label(data$MZvt_HWK) <- "MVPA-SL"
units(data$MZvt_HWK) <- "hours/week"
Make the actual table:
~ B_sex + B_age + ... + B_ther_cur_regrouped.1 defines
the rows;| totmet_hpwk_tertiles defines the columns;table1(~ totmet_hpwk + MZvt_HWK + B_sex + B_age + B_PS + B_tumor_type.1 + B_tumor_stage_dich + B_paladj + B_ther_prev + B_ther_cur_regrouped.1| totmet_hpwk_tertiles,
data = data,
overall = "Overall",
# caption = "Table 1: baseline characteristics of combined ipilimumab",
# footnote = "RCC: renal cell carcinoma",
topclass = "Rtable1-zebra",
render.categorical = my.render.cat,
render.continuous = my.render.cont,#c(.="median [Q1, Q3]"),
render.missing = my.render.missing,
droplevels = TRUE)
| [0,51] (N=62) |
(51,101] (N=46) |
(101,371] (N=42) |
Overall (N=150) |
|
|---|---|---|---|---|
| total PA (hours/week) | ||||
| median [Q1-Q3] | 30.0 [15.0-39.1] | 73.0 [62.2-82.6] | 122.9 [114.5-160.4] | 62.6 [35.2-107.3] |
| mean (SD) | 26.8 (14.4) | 73.1 (13.3) | 143.1 (49.6) | 73.5 (55.6) |
| MVPA-SL (hours/week) | ||||
| median [Q1-Q3] | 0.5 [0.0-2.2] | 5.0 [1.1-8.0] | 7.1 [4.1-13.4] | 3.1 [0.0-7.0] |
| mean (SD) | 1.5 (2.2) | 5.3 (4.7) | 9.4 (6.8) | 4.9 (5.6) |
| sex | ||||
| male | 42 (67.7%) | 31 (67.4%) | 24 (57.1%) | 97 (64.7%) |
| female | 20 (32.3%) | 15 (32.6%) | 18 (42.9%) | 53 (35.3%) |
| age (years) | ||||
| median [Q1-Q3] | 67.0 [59.5-74.8] | 64.5 [54.0-74.0] | 62.5 [58.0-69.8] | 65.5 [57.0-74.0] |
| mean (SD) | 65.4 (13.0) | 62.4 (15.2) | 64.0 (10.3) | 64.1 (13.0) |
| ECOG performance status | ||||
| 0 | 18 (29.0%) | 15 (32.6%) | 22 (52.4%) | 55 (36.7%) |
| 1 | 30 (48.4%) | 29 (63.0%) | 16 (38.1%) | 75 (50.0%) |
| 2 | 11 (17.7%) | 1 (2.2%) | 3 (7.1%) | 15 (10.0%) |
| 3 | 2 (3.2%) | 0 (0.0%) | 0 (0.0%) | 2 (1.3%) |
| 4 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| unknown | 1 (1.6%) | 1 (2.2%) | 1 (2.4%) | 3 (2.0%) |
| tumor type | ||||
| melanoma | 24 (38.7%) | 19 (41.3%) | 20 (47.6%) | 63 (42.0%) |
| NSCLC | 13 (21.0%) | 7 (15.2%) | 9 (21.4%) | 29 (19.3%) |
| RCC | 10 (16.1%) | 9 (19.6%) | 9 (21.4%) | 28 (18.7%) |
| other | 15 (24.2%) | 11 (23.9%) | 4 (9.5%) | 30 (20.0%) |
| stage | ||||
| III | 4 (6.5%) | 6 (13.0%) | 4 (9.5%) | 14 (9.3%) |
| IV | 57 (91.9%) | 39 (84.8%) | 38 (90.5%) | 134 (89.3%) |
| other | 1 (1.6%) | 1 (2.2%) | 0 (0.0%) | 2 (1.3%) |
| treatment intent | ||||
| palliative | 62 (100.0%) | 46 (100.0%) | 42 (100.0%) | 150 (100.0%) |
| adjuvant | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| previous systemic therapy | ||||
| no | 45 (72.6%) | 37 (80.4%) | 35 (83.3%) | 117 (78.0%) |
| yes | 17 (27.4%) | 9 (19.6%) | 7 (16.7%) | 33 (22.0%) |
| therapy | ||||
| anti-PD-(L)1 | 30 (48.4%) | 23 (50.0%) | 11 (26.2%) | 64 (42.7%) |
| cICI | 21 (33.9%) | 18 (39.1%) | 22 (52.4%) | 61 (40.7%) |
| ICI + chemo/targeted therapy | 11 (17.7%) | 5 (10.9%) | 9 (21.4%) | 25 (16.7%) |
Indeed we selected only palliative patients.
To confirm the results for overall survival we will perform the Cox
proportional hazard regression and KM curves, to asses the correlation
of total MET hours per week (totmet_hpwk_tertiles) and
MVPA-SL (MZvt_HWK_tertiles), in different functional forms
- categorial: tertiles by sample size of overall dataset - continuous:
splines to see whether shape is similar to all patients
We first flip the order of levels of
totmet_hpwk_tertiles, since it appears that patients with
highest values have better survival and thus it is more intuitive.
OS_pal$totmet_hpwk_tertiles_KM <- factor(OS_pal$totmet_hpwk_tertiles
, levels = rev(levels(OS_pal$totmet_hpwk_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles_KM ,
data = OS_pal)
KM_OS_MET_pal <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="total PA"
,legend.labs=c(paste0("high ",
levels(OS_pal$totmet_hpwk_tertiles_KM)[1]),
paste0("moderate ",
levels(OS_pal$totmet_hpwk_tertiles_KM)[2]),
paste0("low ",
levels(OS_pal$totmet_hpwk_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MET_pal
KM_OS_MET_pal$plot <- KM_OS_MET_pal$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_pal)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk_tertiles | |||
| [0,51] | — | — | |
| (51,101] | 0.55 | 0.29, 1.04 | 0.066 |
| (101,371] | 0.39 | 0.19, 0.81 | 0.011 |
| B_sex | |||
| male | — | — | |
| female | 1.12 | 0.64, 1.93 | 0.7 |
| B_age | 0.99 | 0.97, 1.01 | 0.4 |
| B_ther_prev | |||
| no | — | — | |
| yes | 1.92 | 1.04, 3.53 | 0.036 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.01 | 0.51, 2.03 | >0.9 |
| ICI + chemo/targeted therapy | 1.94 | 1.01, 3.74 | 0.048 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Run initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_pal)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.98 | 0.97, 0.99 | 0.004 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.02 | 1.00, 1.03 | 0.048 |
| B_sex | |||
| male | — | — | |
| female | 1.15 | 0.66, 2.00 | 0.6 |
| B_age | 0.99 | 0.97, 1.01 | 0.3 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.00 | 1.08, 3.72 | 0.028 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.03 | 0.52, 2.06 | >0.9 |
| ICI + chemo/targeted therapy | 2.03 | 1.06, 3.89 | 0.033 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## loglik Chisq Df Pr(>|Chi|)
## 1 -243.22
## 2 -248.33 10.227 2 0.006014 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)+totmet_hpwk))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1 + totmet_hpwk
## loglik Chisq Df Pr(>|Chi|)
## 1 -243.22
## 2 -244.80 3.1687 1 0.07506 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Thus, using restricted cubic splines for totmet_hpwk
does NOT improve our model fit..
Make dummy database.
newdf <- with(OS_pal,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
confidence intervals (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
Now, we can finally make the actual plot.
p_splines_OS_MET_pal <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = OS_pal, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MET_pal
Kaplan Meier curve for tertiles (=unadjusted)
OS_pal$MZvt_HWK_tertiles_KM <- factor(OS_pal$MZvt_HWK_tertiles
, levels = rev(levels(OS_pal$MZvt_HWK_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles_KM ,
data = OS_pal)
KM_OS_MVPASL_pal <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="MVPA-SL"
,legend.labs=c(paste0("high ",
levels(OS_pal$MZvt_HWK_tertiles_KM)[1]),
paste0("moderate ",
levels(OS_pal$MZvt_HWK_tertiles_KM)[2]),
paste0("low ",
levels(OS_pal$MZvt_HWK_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MVPASL_pal
KM_OS_MVPASL_pal$plot <- KM_OS_MVPASL_pal$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_pal)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK_tertiles | |||
| [0,1.5] | — | — | |
| (1.5,6.5] | 0.80 | 0.45, 1.42 | 0.4 |
| (6.5,38.5] | 0.37 | 0.16, 0.87 | 0.022 |
| B_sex | |||
| male | — | — | |
| female | 0.97 | 0.56, 1.68 | >0.9 |
| B_age | 1.00 | 0.98, 1.02 | 0.8 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.02 | 1.10, 3.73 | 0.024 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.00 | 0.51, 1.99 | >0.9 |
| ICI + chemo/targeted therapy | 1.88 | 0.98, 3.58 | 0.056 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_pal)
tbl_regression(cox_rcs.adj, exponentiate = T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.87 | 0.75, 1.02 | 0.092 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.11 | 0.78, 1.59 | 0.5 |
| B_sex | |||
| male | — | — | |
| female | 0.95 | 0.55, 1.65 | 0.9 |
| B_age | 1.00 | 0.97, 1.02 | 0.7 |
| B_ther_prev | |||
| no | — | — | |
| yes | 1.97 | 1.07, 3.61 | 0.029 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 0.97 | 0.49, 1.91 | >0.9 |
| ICI + chemo/targeted therapy | 1.82 | 0.96, 3.47 | 0.068 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## loglik Chisq Df Pr(>|Chi|)
## 1 -244.09
## 2 -248.33 8.4855 2 0.01437 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_ther_prev + B_ther_cur_regrouped.1 + MZvt_HWK
## loglik Chisq Df Pr(>|Chi|)
## 1 -244.09
## 2 -244.26 0.3365 1 0.5618
Make dummy dataset.
newdf <- with(OS_pal,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=200), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_OS_MVPASL_pal <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = OS_pal, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .5
)+ scale_x_continuous(expand = c(0,0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 30) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MVPASL_pal
Make Supplementary Figure 5.
KM_OS_MET_pal <- ggarrange(KM_OS_MET_pal$plot,KM_OS_MET_pal$table
, ncol=1, nrow=2
, heights= c(.75,.25))
KM_OS_MVPASL_pal <- ggarrange(KM_OS_MVPASL_pal$plot,KM_OS_MVPASL_pal$table
, ncol=1, nrow=2
, heights= c(.75,.25))
suppl_figure4 <- ggarrange(KM_OS_MET_pal, p_splines_OS_MET_pal
,KM_OS_MVPASL_pal, p_splines_OS_MVPASL_pal
,labels = c("A","B","C","D")
,ncol=2, nrow=2
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure4
Save as PDF.
Make OS_melanoma.
summary(data_OS$B_lab_LDH)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 94.0 173.8 196.0 218.8 226.5 1124.0 3
table(data_OS$B_tumor_type.1, useNA = "always")
##
## melanoma NSCLC RCC other <NA>
## 153 38 28 32 0
OS_melanoma <- data_OS[data_OS$B_tumor_type.1 %in% "melanoma",]
summary(OS_melanoma$B_tumor_type.1)
## melanoma NSCLC RCC other
## 153 0 0 0
table(OS_melanoma$B_ther_cur_regrouped.1, useNA = "always")
##
## anti-PD-(L)1 cICI
## 119 34
## ICI + chemo/targeted therapy <NA>
## 0 0
OS_melanoma$B_ther_cur_regrouped.1 <- droplevels(OS_melanoma$B_ther_cur_regrouped.1)
OS_melanoma[is.na(OS_melanoma$B_lab_LDH),c("record_id","B_lab_LDH")]
## [1] record_id B_lab_LDH
## <0 rows> (or 0-length row.names)
hist(OS_melanoma$B_lab_LDH)
# OS_melanoma[OS_melanoma$B_lab_LDH=="1124",]$record_id #checked --> correct
We checked the outlier with LDH of 1124U/L in electronic patient files and it is correct.
Specify data frame to use
OS_melanoma$B_lab_LDH_ULN<- cut(OS_melanoma$B_lab_LDH, breaks = c(0, 250, 500, (max(OS_melanoma$B_lab_LDH)+1)),
labels = c("<1x ULN", "1-2x ULN", ">2x ULN"))
table(OS_melanoma$B_lab_LDH_ULN, useNA = "always")
##
## <1x ULN 1-2x ULN >2x ULN <NA>
## 136 14 3 0
data <- OS_melanoma
Relabel variables and add units that you want to use in your table (all formatting to let the table look nicely):
label(data$B_sex) <- "sex"
label(data$B_age) <- "age"
units(data$B_age) <- "years"
label(data$B_PS) <- "ECOG performance status"
label(data$B_tumor_type.1) <- "tumor type"
label(data$B_tumor_stage_dich) <- "stage"
label(data$B_paladj) <- "treatment intent"
label(data$B_ther_prev) <- "previous systemic therapy"
label(data$B_ther_cur_regrouped.1) <- "therapy"
label(data$T1_gr3_1y) <- "severe irAE within 1 year"
label(data$B_lab_LDH) <- "lactate dehydrogenase"
label (data$B_lab_LDH_ULN) <- "lactate dehydrogenase"
label(data$totmet_hpwk_tertiles) <- "MET-hours per week"
label(data$totmet_hpwk) <- "MET-hours"
units(data$totmet_hpwk) <- "hours/week"
label(data$MZvt_HWK) <- "MVPA-SL"
units(data$MZvt_HWK) <- "hours/week"
Make the actual table:
~ B_sex + B_age + ... + B_ther_cur_regrouped.1 defines
the rows;| totmet_hpwk_tertiles defines the columns;table1(~ totmet_hpwk + MZvt_HWK + B_sex + B_age + B_PS + B_tumor_type.1 + B_tumor_stage_dich + B_lab_LDH_ULN + B_paladj + B_ther_prev + B_ther_cur_regrouped.1| totmet_hpwk_tertiles,
data = data,
overall = "Overall",
# caption = "Table 1: baseline characteristics of combined ipilimumab",
# footnote = "RCC: renal cell carcinoma",
topclass = "Rtable1-zebra",
render.categorical = my.render.cat,
render.continuous = my.render.cont,#c(.="median [Q1, Q3]"),
render.missing = my.render.missing,
droplevels = TRUE)
| [0,51] (N=40) |
(51,101] (N=54) |
(101,371] (N=59) |
Overall (N=153) |
|
|---|---|---|---|---|
| MET-hours (hours/week) | ||||
| median [Q1-Q3] | 38.2 [28.0-41.7] | 75.4 [62.0-87.0] | 132.3 [112.0-171.7] | 83.8 [47.3-119.8] |
| mean (SD) | 32.6 (12.9) | 75.4 (13.8) | 151.8 (56.1) | 93.7 (61.0) |
| MVPA-SL (hours/week) | ||||
| median [Q1-Q3] | 1.0 [0.0-3.5] | 4.6 [1.5-6.5] | 10.0 [5.0-17.7] | 4.7 [1.0-9.5] |
| mean (SD) | 1.8 (2.3) | 4.9 (4.1) | 11.3 (8.4) | 6.5 (7.1) |
| sex | ||||
| male | 20 (50.0%) | 40 (74.1%) | 37 (62.7%) | 97 (63.4%) |
| female | 20 (50.0%) | 14 (25.9%) | 22 (37.3%) | 56 (36.6%) |
| age (years) | ||||
| median [Q1-Q3] | 67.5 [59.8-72.3] | 64.0 [52.5-73.0] | 62.0 [54.5-71.5] | 64.0 [54.0-73.0] |
| mean (SD) | 63.1 (14.9) | 62.2 (13.0) | 61.5 (12.2) | 62.2 (13.2) |
| ECOG performance status | ||||
| 0 | 18 (45.0%) | 30 (55.6%) | 42 (71.2%) | 90 (58.8%) |
| 1 | 20 (50.0%) | 21 (38.9%) | 13 (22.0%) | 54 (35.3%) |
| 2 | 2 (5.0%) | 1 (1.9%) | 2 (3.4%) | 5 (3.3%) |
| 3 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| 4 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| unknown | 0 (0.0%) | 2 (3.7%) | 2 (3.4%) | 4 (2.6%) |
| tumor type | ||||
| melanoma | 40 (100.0%) | 54 (100.0%) | 59 (100.0%) | 153 (100.0%) |
| NSCLC | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| RCC | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| other | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| stage | ||||
| III | 17 (42.5%) | 33 (61.1%) | 38 (64.4%) | 88 (57.5%) |
| IV | 22 (55.0%) | 21 (38.9%) | 21 (35.6%) | 64 (41.8%) |
| other | 1 (2.5%) | 0 (0.0%) | 0 (0.0%) | 1 (0.7%) |
| lactate dehydrogenase | ||||
| <1x ULN | 31 (77.5%) | 53 (98.1%) | 52 (88.1%) | 136 (88.9%) |
| 1-2x ULN | 8 (20.0%) | 1 (1.9%) | 5 (8.5%) | 14 (9.2%) |
| >2x ULN | 1 (2.5%) | 0 (0.0%) | 2 (3.4%) | 3 (2.0%) |
| treatment intent | ||||
| palliative | 24 (60.0%) | 19 (35.2%) | 20 (33.9%) | 63 (41.2%) |
| adjuvant | 16 (40.0%) | 35 (64.8%) | 39 (66.1%) | 90 (58.8%) |
| previous systemic therapy | ||||
| no | 37 (92.5%) | 52 (96.3%) | 55 (93.2%) | 144 (94.1%) |
| yes | 3 (7.5%) | 2 (3.7%) | 4 (6.8%) | 9 (5.9%) |
| therapy | ||||
| anti-PD-(L)1 | 28 (70.0%) | 45 (83.3%) | 46 (78.0%) | 119 (77.8%) |
| cICI | 12 (30.0%) | 9 (16.7%) | 13 (22.0%) | 34 (22.2%) |
Indeed we selected only melanoma patients.
To confirm the results for overall survival we will perform the Cox
proportional hazard regression and KM curves, to asses the correlation
of total MET hours per week (totmet_hpwk_tertiles) and
MVPA-SL (MZvt_HWK_tertiles), in different functional forms
- categorial: tertiles by sample size of overall dataset - continuous:
splines to see whether shape is similar to all patients
We first flip the order of levels of
totmet_hpwk_tertiles, since it appears that patients with
highest values have better survival and thus it is more intuitive.
OS_melanoma$totmet_hpwk_tertiles_KM <- factor(OS_melanoma$totmet_hpwk_tertiles
, levels = rev(levels(OS_melanoma$totmet_hpwk_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles_KM ,
data = OS_melanoma)
KM_OS_MET_mel <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="total PA"
,legend.labs=c(paste0("high ",
levels(OS_melanoma$totmet_hpwk_tertiles_KM)[1]),
paste0("moderate ",
levels(OS_melanoma$totmet_hpwk_tertiles_KM)[2]),
paste0("low ",
levels(OS_melanoma$totmet_hpwk_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MET_mel
KM_OS_MET_mel$plot <- KM_OS_MET_mel$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH,
data = OS_melanoma)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk_tertiles | |||
| [0,51] | — | — | |
| (51,101] | 0.79 | 0.31, 2.03 | 0.6 |
| (101,371] | 0.56 | 0.21, 1.50 | 0.2 |
| B_sex | |||
| male | — | — | |
| female | 0.95 | 0.41, 2.17 | 0.9 |
| B_age | 0.99 | 0.97, 1.02 | 0.5 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.75 | 0.25, 2.25 | 0.6 |
| B_ther_prev | |||
| no | — | — | |
| yes | 5.28 | 1.54, 18.1 | 0.008 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.59 | 0.46, 5.56 | 0.5 |
| B_lab_LDH | 1.00 | 1.00, 1.00 | 0.7 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Run initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH,
data = OS_melanoma)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.98 | 0.96, 1.00 | 0.10 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.02 | 1.00, 1.04 | 0.13 |
| B_sex | |||
| male | — | — | |
| female | 0.91 | 0.40, 2.09 | 0.8 |
| B_age | 0.99 | 0.96, 1.01 | 0.4 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.70 | 0.24, 2.09 | 0.5 |
| B_ther_prev | |||
| no | — | — | |
| yes | 5.03 | 1.48, 17.1 | 0.010 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.44 | 0.41, 5.08 | 0.6 |
| B_lab_LDH | 1.00 | 1.00, 1.00 | 0.7 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## loglik Chisq Df Pr(>|Chi|)
## 1 -106.68
## 2 -107.93 2.5076 2 0.2854
Than, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)+totmet_hpwk))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH + totmet_hpwk
## loglik Chisq Df Pr(>|Chi|)
## 1 -106.68
## 2 -107.72 2.0867 1 0.1486
Thus, using restricted cubic splines for totmet_hpwk
does NOT improve our model fit..
Make dummy database.
newdf <- with(OS_melanoma,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_lab_LDH = 0
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
confidence intervals (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
Now, we can finally make the actual plot.
p_splines_OS_MET_mel <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = OS_melanoma, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MET_mel
Kaplan Meier curve for tertiles (=unadjusted)
OS_melanoma$MZvt_HWK_tertiles_KM <- factor(OS_melanoma$MZvt_HWK_tertiles
, levels = rev(levels(OS_melanoma$MZvt_HWK_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles_KM ,
data = OS_melanoma)
KM_OS_MVPASL_mel <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="MVPA-SL"
,legend.labs=c(paste0("high ",
levels(OS_melanoma$MZvt_HWK_tertiles_KM)[1]),
paste0("moderate ",
levels(OS_melanoma$MZvt_HWK_tertiles_KM)[2]),
paste0("low ",
levels(OS_melanoma$MZvt_HWK_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MVPASL_mel
KM_OS_MVPASL_mel$plot <- KM_OS_MVPASL_mel$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH,
data = OS_melanoma)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK_tertiles | |||
| [0,1.5] | — | — | |
| (1.5,6.5] | 0.48 | 0.18, 1.29 | 0.14 |
| (6.5,38.5] | 0.50 | 0.18, 1.41 | 0.2 |
| B_sex | |||
| male | — | — | |
| female | 0.76 | 0.31, 1.88 | 0.6 |
| B_age | 0.99 | 0.96, 1.02 | 0.5 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.68 | 0.23, 2.04 | 0.5 |
| B_ther_prev | |||
| no | — | — | |
| yes | 4.86 | 1.36, 17.4 | 0.015 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.41 | 0.41, 4.88 | 0.6 |
| B_lab_LDH | 1.00 | 1.00, 1.00 | 0.7 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH,
data = OS_melanoma)
tbl_regression(cox_rcs.adj, exponentiate = T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.76 | 0.60, 0.96 | 0.020 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.55 | 1.06, 2.26 | 0.023 |
| B_sex | |||
| male | — | — | |
| female | 0.80 | 0.33, 1.97 | 0.6 |
| B_age | 0.99 | 0.96, 1.02 | 0.5 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.61 | 0.21, 1.75 | 0.4 |
| B_ther_prev | |||
| no | — | — | |
| yes | 4.83 | 1.45, 16.1 | 0.010 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.29 | 0.38, 4.33 | 0.7 |
| B_lab_LDH | 1.00 | 1.00, 1.00 | 0.8 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## loglik Chisq Df Pr(>|Chi|)
## 1 -105.09
## 2 -107.93 5.6934 2 0.05804 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_lab_LDH + MZvt_HWK
## loglik Chisq Df Pr(>|Chi|)
## 1 -105.09
## 2 -107.75 5.3294 1 0.02097 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Make dummy dataset.
newdf <- with(OS_melanoma,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=200), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_lab_LDH = 0
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_OS_MVPASL_mel <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = OS_melanoma, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .5
)+ scale_x_continuous(expand = c(0,0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 30) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MVPASL_mel
Make Supplementary Figure 4.
KM_OS_MET_mel <- ggarrange(KM_OS_MET_mel$plot,KM_OS_MET_mel$table
, ncol=1, nrow=2
, heights= c(.75,.25))
KM_OS_MVPASL_mel <- ggarrange(KM_OS_MVPASL_mel$plot,KM_OS_MVPASL_mel$table
, ncol=1, nrow=2
, heights= c(.75,.25))
suppl_figure5 <- ggarrange(KM_OS_MET_mel, p_splines_OS_MET_mel
,KM_OS_MVPASL_mel, p_splines_OS_MVPASL_mel
,labels = c("A","B","C","D")
,ncol=2, nrow=2
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure5
Save as PDF.
Make OS_PS.
OS_PS <- data_OS[data_OS$B_PS%in%c("0","1"),]
table(OS_PS$B_ther_cur_regrouped.1, useNA = "always")
##
## anti-PD-(L)1 cICI
## 151 53
## ICI + chemo/targeted therapy <NA>
## 23 0
table(OS_PS$B_tumor_type.1, useNA = "always")
##
## melanoma NSCLC RCC other <NA>
## 144 34 20 29 0
Specify data frame to use
data <- OS_PS
Relabel variables and add units that you want to use in your table (all formatting to let the table look nicely):
label(data$B_sex) <- "sex"
label(data$B_age) <- "age"
units(data$B_age) <- "years"
label(data$B_PS) <- "ECOG performance status"
label(data$B_tumor_type.1) <- "tumor type"
label(data$B_tumor_stage_dich) <- "stage"
label(data$B_paladj) <- "treatment intent"
label(data$B_ther_prev) <- "previous systemic therapy"
label(data$B_ther_cur_regrouped.1) <- "therapy"
label(data$T1_gr3_1y) <- "severe irAE within 1 year"
label(data$totmet_hpwk_tertiles) <- "total PA"
label(data$totmet_hpwk) <- "total PA"
units(data$totmet_hpwk) <- "hours/week"
label(data$MZvt_HWK) <- "MVPA-SL"
units(data$MZvt_HWK) <- "hours/week"
Make the actual table:
~ B_sex + B_age + ... + B_ther_cur_regrouped.1 defines
the rows;| totmet_hpwk_tertiles defines the columns;table1(~ totmet_hpwk + MZvt_HWK + B_sex + B_age + B_PS + B_tumor_type.1 + B_tumor_stage_dich + B_paladj + B_ther_prev + B_ther_cur_regrouped.1| totmet_hpwk_tertiles,
data = data,
overall = "Overall",
# caption = "Table 1: baseline characteristics of combined ipilimumab",
# footnote = "RCC: renal cell carcinoma",
topclass = "Rtable1-zebra",
render.categorical = my.render.cat,
render.continuous = my.render.cont,#c(.="median [Q1, Q3]"),
render.missing = my.render.missing,
droplevels = TRUE)
| [0,51] (N=69) |
(51,101] (N=80) |
(101,371] (N=78) |
Overall (N=227) |
|
|---|---|---|---|---|
| total PA (hours/week) | ||||
| median [Q1-Q3] | 35.1 [22.9-41.3] | 75.4 [62.0-85.9] | 131.1 [112.6-173.0] | 77.8 [42.2-113.6] |
| mean (SD) | 30.8 (12.8) | 74.6 (13.7) | 149.8 (52.1) | 87.1 (58.5) |
| MVPA-SL (hours/week) | ||||
| median [Q1-Q3] | 1.0 [0.0-3.0] | 5.0 [1.9-8.1] | 8.9 [4.0-15.0] | 4.0 [1.0-8.5] |
| mean (SD) | 1.6 (2.1) | 5.6 (4.6) | 10.5 (8.1) | 6.1 (6.6) |
| sex | ||||
| male | 41 (59.4%) | 58 (72.5%) | 48 (61.5%) | 147 (64.8%) |
| female | 28 (40.6%) | 22 (27.5%) | 30 (38.5%) | 80 (35.2%) |
| age (years) | ||||
| median [Q1-Q3] | 66.0 [59.0-73.0] | 65.0 [54.0-73.0] | 61.0 [54.0-69.0] | 64.0 [54.0-72.0] |
| mean (SD) | 63.9 (12.8) | 62.8 (13.1) | 60.5 (10.8) | 62.4 (12.3) |
| ECOG performance status | ||||
| 0 | 31 (44.9%) | 39 (48.8%) | 56 (71.8%) | 126 (55.5%) |
| 1 | 38 (55.1%) | 41 (51.2%) | 22 (28.2%) | 101 (44.5%) |
| 2 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| 3 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| 4 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| unknown | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| tumor type | ||||
| melanoma | 38 (55.1%) | 51 (63.8%) | 55 (70.5%) | 144 (63.4%) |
| NSCLC | 14 (20.3%) | 8 (10.0%) | 12 (15.4%) | 34 (15.0%) |
| RCC | 4 (5.8%) | 9 (11.2%) | 7 (9.0%) | 20 (8.8%) |
| other | 13 (18.8%) | 12 (15.0%) | 4 (5.1%) | 29 (12.8%) |
| stage | ||||
| III | 22 (31.9%) | 37 (46.2%) | 39 (50.0%) | 98 (43.2%) |
| IV | 46 (66.7%) | 41 (51.2%) | 39 (50.0%) | 126 (55.5%) |
| other | 1 (1.4%) | 2 (2.5%) | 0 (0.0%) | 3 (1.3%) |
| treatment intent | ||||
| palliative | 48 (69.6%) | 44 (55.0%) | 38 (48.7%) | 130 (57.3%) |
| adjuvant | 21 (30.4%) | 36 (45.0%) | 40 (51.3%) | 97 (42.7%) |
| previous systemic therapy | ||||
| no | 51 (73.9%) | 70 (87.5%) | 66 (84.6%) | 187 (82.4%) |
| yes | 18 (26.1%) | 10 (12.5%) | 12 (15.4%) | 40 (17.6%) |
| therapy | ||||
| anti-PD-(L)1 | 44 (63.8%) | 57 (71.2%) | 50 (64.1%) | 151 (66.5%) |
| cICI | 16 (23.2%) | 18 (22.5%) | 19 (24.4%) | 53 (23.3%) |
| ICI + chemo/targeted therapy | 9 (13.0%) | 5 (6.2%) | 9 (11.5%) | 23 (10.1%) |
Indeed we selected only patients with performance status 0 or 1.
We first flip the order of levels of
totmet_hpwk_tertiles, since it appeart that patients with
highest values have better survival and thus it is more intuitive.
OS_PS$totmet_hpwk_tertiles_KM <- factor(OS_PS$totmet_hpwk_tertiles
, levels = rev(levels(OS_PS$totmet_hpwk_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles_KM ,
data = OS_PS)
KM_OS_MET_PS <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="total PA"
,legend.labs=c(paste0("high ",
levels(OS_PS$totmet_hpwk_tertiles_KM)[1]),
paste0("moderate ",
levels(OS_PS$totmet_hpwk_tertiles_KM)[2]),
paste0("low ",
levels(OS_PS$totmet_hpwk_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MET_PS
KM_OS_MET_PS$plot <- KM_OS_MET_PS$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_PS)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk_tertiles | |||
| [0,51] | — | — | |
| (51,101] | 0.74 | 0.39, 1.38 | 0.3 |
| (101,371] | 0.61 | 0.32, 1.15 | 0.12 |
| B_sex | |||
| male | — | — | |
| female | 1.53 | 0.90, 2.59 | 0.12 |
| B_age | 0.99 | 0.97, 1.01 | 0.6 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.52 | 0.24, 1.09 | 0.081 |
| B_ther_prev | |||
| no | — | — | |
| yes | 1.97 | 1.07, 3.64 | 0.030 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.14 | 0.52, 2.49 | 0.7 |
| ICI + chemo/targeted therapy | 2.22 | 1.06, 4.64 | 0.034 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Run initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_PS)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.99 | 0.98, 1.00 | 0.067 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.01 | 1.00, 1.03 | 0.10 |
| B_sex | |||
| male | — | — | |
| female | 1.51 | 0.89, 2.56 | 0.13 |
| B_age | 0.99 | 0.97, 1.01 | 0.5 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.51 | 0.24, 1.07 | 0.076 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.01 | 1.08, 3.71 | 0.027 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.10 | 0.50, 2.43 | 0.8 |
| ICI + chemo/targeted therapy | 2.22 | 1.08, 4.59 | 0.031 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## loglik Chisq Df Pr(>|Chi|)
## 1 -265.88
## 2 -267.50 3.2339 2 0.1985
Than, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)+totmet_hpwk))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + totmet_hpwk
## loglik Chisq Df Pr(>|Chi|)
## 1 -265.88
## 2 -267.10 2.438 1 0.1184
Thus, using restricted cubic splines for totmet_hpwk
does NOT improve our model fit..
Make dummy database.
newdf <- with(OS_PS,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
confidence intervals (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
Now, we can finally make the actual plot.
p_splines_OS_MET_PS <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = OS_PS, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MET_PS
Kaplan Meier curve for tertiles (=unadjusted)
OS_PS$MZvt_HWK_tertiles_KM <- factor(OS_PS$MZvt_HWK_tertiles
, levels = rev(levels(OS_PS$MZvt_HWK_tertiles)))
fit_tert.crude <- survfit(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles_KM ,
data = OS_PS)
KM_OS_MVPASL_PS <- ggsurvplot(fit_tert.crude
,conf.int = F
,pval = F
,xscale="d_m"
,break.x.by=30.4375*6
# ,title=""
,xlab="Time (months)"
,ylab="Survival probability"
,legend = c(.26,.24)
,legend.title="MVPA-SL"
,legend.labs=c(paste0("high ",
levels(data_OS$MZvt_HWK_tertiles_KM)[1]),
paste0("moderate ",
levels(data_OS$MZvt_HWK_tertiles_KM)[2]),
paste0("low ",
levels(data_OS$MZvt_HWK_tertiles_KM)[3]))
,axes.offset=F
,surv.median.line = "none"
,consor=T
,censor.shape=124
,censor.size=2
,risk.table = T
,risk.table.y.text=F
,fontsize=3.5
,ggtheme = theme_survminer(font.x = 12, font.y=12, font.main=12, font.tickslab=10, font.legend=10)
,tables.theme = theme_cleantable(base_size = 10, font.x=10, font.y=10, font.main=12)
)
KM_OS_MVPASL_PS
KM_OS_MVPASL_PS$plot <- KM_OS_MVPASL_PS$plot +
theme(legend.background = element_rect(fill=alpha("white", alpha=0)))
Adjusted Cox regression for tertiles linear variable
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_PS)
tbl_regression(cox_tert.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK_tertiles | |||
| [0,1.5] | — | — | |
| (1.5,6.5] | 0.66 | 0.36, 1.20 | 0.2 |
| (6.5,38.5] | 0.64 | 0.32, 1.30 | 0.2 |
| B_sex | |||
| male | — | — | |
| female | 1.45 | 0.85, 2.47 | 0.2 |
| B_age | 1.00 | 0.97, 1.02 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.53 | 0.25, 1.12 | 0.095 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.02 | 1.09, 3.75 | 0.026 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.11 | 0.51, 2.40 | 0.8 |
| ICI + chemo/targeted therapy | 2.26 | 1.09, 4.66 | 0.028 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
Initial regression.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1,
data = OS_PS)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.85 | 0.73, 0.98 | 0.026 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.31 | 1.02, 1.68 | 0.033 |
| B_sex | |||
| male | — | — | |
| female | 1.43 | 0.84, 2.46 | 0.2 |
| B_age | 1.00 | 0.97, 1.02 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.51 | 0.24, 1.07 | 0.076 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.09 | 1.12, 3.88 | 0.020 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 1.06 | 0.49, 2.29 | 0.9 |
| ICI + chemo/targeted therapy | 2.14 | 1.04, 4.41 | 0.039 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## loglik Chisq Df Pr(>|Chi|)
## 1 -265.0
## 2 -267.5 5.0038 2 0.08193 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + MZvt_HWK
## loglik Chisq Df Pr(>|Chi|)
## 1 -265.00
## 2 -267.19 4.3749 1 0.03647 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Make dummy dataset.
newdf <- with(OS_PS,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=200), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
))
Now we make a model matrix (= some technical stuff needed for next step).
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_OS_MVPASL_PS <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = OS_PS, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .5
)+ scale_x_continuous(expand = c(0,0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 30) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MVPASL_PS
Make Supplementary Figure 6.
KM_OS_MET_PS <- ggarrange(KM_OS_MET_PS$plot,KM_OS_MET_PS$table
, ncol=1, nrow=2
, heights= c(.75,.25))
KM_OS_MVPASL_PS <- ggarrange(KM_OS_MVPASL_PS$plot,KM_OS_MVPASL_PS$table
, ncol=1, nrow=2
, heights= c(.75,.25))
suppl_figure6 <- ggarrange(KM_OS_MET_PS, p_splines_OS_MET_PS
,KM_OS_MVPASL_PS, p_splines_OS_MVPASL_PS
,labels = c("A","B","C","D")
,ncol=2, nrow=2
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure6
Save as PDF.
Here, we will rerun the same analyses as primary analyses, but now with additional adjustment for ECOG performance status. ECOG performance status was missing for 6 patients (2.4%). Electronic patient files were checked, data seemed missing completely at random (MCAR). Given the low number of missing values, missingness under MCAR assumption and the analysis being a sensitivity analysis, a complete case analysis was considered appropriate.
Adjusted Cox regression for tertiles linear variable.
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, data = data_OS)
cox_tert.adj_PS <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, data = data_OS)
tbl_merge(list(
tbl_regression(cox_tert.adj, exponentiate=T),
tbl_regression(cox_tert.adj_PS, exponentiate=T)
),
tab_spanner = c("without ECOG PS", "with ECOG PS")
)
| Characteristic | without ECOG PS | with ECOG PS | ||||
|---|---|---|---|---|---|---|
| HR1 | 95% CI1 | p-value | HR1 | 95% CI1 | p-value | |
| totmet_hpwk_tertiles | ||||||
| [0,51] | — | — | — | — | ||
| (51,101] | 0.58 | 0.32, 1.04 | 0.065 | 0.69 | 0.37, 1.29 | 0.2 |
| (101,371] | 0.48 | 0.27, 0.89 | 0.019 | 0.60 | 0.32, 1.11 | 0.10 |
| B_sex | ||||||
| male | — | — | — | — | ||
| female | 1.31 | 0.78, 2.18 | 0.3 | 1.35 | 0.80, 2.27 | 0.3 |
| B_age | 0.99 | 0.97, 1.01 | 0.4 | 0.98 | 0.96, 1.00 | 0.082 |
| B_tumor_type.1 | ||||||
| melanoma | — | — | — | — | ||
| NSCLC | 1.14 | 0.49, 2.64 | 0.8 | 1.45 | 0.65, 3.27 | 0.4 |
| RCC | 3.53 | 1.33, 9.33 | 0.011 | 3.52 | 1.42, 8.70 | 0.006 |
| other | 1.16 | 0.52, 2.61 | 0.7 | 1.22 | 0.54, 2.74 | 0.6 |
| B_paladj | ||||||
| palliative | — | — | — | — | ||
| adjuvant | 0.43 | 0.20, 0.92 | 0.030 | 0.62 | 0.28, 1.36 | 0.2 |
| B_ther_prev | ||||||
| no | — | — | — | — | ||
| yes | 1.87 | 1.03, 3.39 | 0.040 | 1.81 | 0.99, 3.29 | 0.053 |
| B_ther_cur_regrouped.1 | ||||||
| anti-PD-(L)1 | — | — | — | — | ||
| cICI | 0.57 | 0.22, 1.50 | 0.3 | 0.78 | 0.32, 1.92 | 0.6 |
| ICI + chemo/targeted therapy | 1.86 | 0.88, 3.94 | 0.11 | 2.22 | 1.07, 4.62 | 0.032 |
| B_PS_tri | ||||||
| 0 | — | — | ||||
| 1 | 1.58 | 0.87, 2.88 | 0.13 | |||
| 2-3 | 6.21 | 2.69, 14.3 | <0.001 | |||
| 1 HR = Hazard Ratio, CI = Confidence Interval | ||||||
Adjusted Cox regression for continuous linear variable.
cox_lin.adj <- coxph(Surv(OS_days, OS_status)~ totmet_hpwk + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, data = data_OS)
# tbl_regression(cox_lin.adj, exponentiate=T)
Shoenfeld residuals plot: it should be possible to draw a horizontal line between the confidence bands. Otherwise, the Cox proportional hazard assumption is violated, meaning that the baseline hazard is not constant over time.
plot(cox.zph(cox_lin.adj, transform = "identity")[1])
abline(h=cox_lin.adj$coefficients[1], col="blue")
Martingale residuals can help decide upon the functional form of a dependent variable: This should give a more or less straight line. If not, consider a transformation of the dependent variable.
fit0 <- update(cox_lin.adj,~.- totmet_hpwk)
resid_X <- resid(fit0, type = "martingale")
if(is.null(fit0$na.action)){
df.tmp <- data.frame(X=data_OS$totmet_hpwk, resid_X=resid_X)
} else {
df.tmp <- data.frame(X=data_OS[-c(fit0$na.action),]$totmet_hpwk, resid_X=resid_X)
}
ggplot(data = df.tmp
)+ geom_point(aes(x=X, y=resid_X), color="black", alpha=0.1
)+ geom_smooth(aes(x=X, y=resid_X), method = loess
)+ labs(x="totmet_hpwk", y="Martingale residuals")
## `geom_smooth()` using formula = 'y ~ x'
fit0 <- update(cox_lin.adj,~.- B_age)
resid_X <- resid(fit0, type = "martingale")
if(is.null(fit0$na.action)){
df.tmp <- data.frame(X=data_OS$B_age, resid_X=resid_X)
} else {
df.tmp <- data.frame(X=data_OS[-c(fit0$na.action),]$B_age, resid_X=resid_X)
}
ggplot(data = df.tmp
)+ geom_point(aes(x=X, y=resid_X), color="black", alpha=0.1
)+ geom_smooth(aes(x=X, y=resid_X), method = loess
)+ labs(x="B_age", y="Martingale residuals")
## `geom_smooth()` using formula = 'y ~ x'
rm(fit0, resid_X, df.tmp)
Conduct survival analysis using splines regression by the
prespecified knots. We make use of the rcs() function in
package rms which provides restricted cubic splines.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, data = data_OS)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| totmet_hpwk | |||
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk | 0.99 | 0.98, 1.00 | 0.11 |
| rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)totmet_hpwk' | 1.01 | 1.00, 1.02 | 0.14 |
| B_sex | |||
| male | — | — | |
| female | 1.34 | 0.80, 2.24 | 0.3 |
| B_age | 0.98 | 0.96, 1.00 | 0.10 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.41 | 0.63, 3.16 | 0.4 |
| RCC | 3.32 | 1.34, 8.23 | 0.010 |
| other | 1.18 | 0.52, 2.67 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.61 | 0.28, 1.33 | 0.2 |
| B_ther_prev | |||
| no | — | — | |
| yes | 1.89 | 1.04, 3.44 | 0.038 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 0.77 | 0.31, 1.90 | 0.6 |
| ICI + chemo/targeted therapy | 2.26 | 1.10, 4.65 | 0.027 |
| B_PS_tri | |||
| 0 | — | — | |
| 1 | 1.62 | 0.90, 2.92 | 0.11 |
| 2-3 | 5.79 | 2.42, 13.9 | <0.001 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether totmet_hpwk improves the model fit
at all (compare a model with splines for totmet_hpwk
(Model 1) to a model without totmet_hpwk
(Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri
## Model 2: ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri
## loglik Chisq Df Pr(>|Chi|)
## 1 -316.70
## 2 -317.97 2.5534 2 0.2789
Then, we check whether using splines for totmet_hpwk
gives a better fit than adding it as a linear variable (compare a model
with splines for totmet_hpwk (Model 1) to a model
with totmet_hpwk as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~totmet_hpwk+.-rms::rcs(totmet_hpwk, k.pos.totmet_hpwk)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(totmet_hpwk, k.pos.totmet_hpwk) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri
## Model 2: ~ totmet_hpwk + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri
## loglik Chisq Df Pr(>|Chi|)
## 1 -316.70
## 2 -317.67 1.9459 1 0.163
We want to visualise the hazard rate (HR) for death (in case of
overall survival (OS)) over the trajectory of our variable of interest
(totmet_hpwk). To do this, we will mak a dummy dataset to
predict the HR for certain values of totmet_hpwk. We will
also obtain the 95% confidence interval (CI).
newdf <- with(data_OS,
expand.grid(totmet_hpwk=c(seq(min(totmet_hpwk),max(totmet_hpwk),length=300), k.pos.totmet_hpwk)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_PS_tri = levels(B_PS_tri)[1]
))
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("totmet_hpwk",colnames(mm))]] <- 0 # set all variables not about totmet_hpwk to 0
For each value of our sequence of totmet_hpwk, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(totmet_hpwk=newdf$totmet_hpwk,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
totmet_hpwk) and sort on totmet_hpwk.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$totmet_hpwk),] #sort based on var1 order
p_splines_OS_MET <- ggplot(data = est
)+ geom_ribbon(aes(x=totmet_hpwk, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=totmet_hpwk,y=eta) #add estimated HR
, color = "coral3"
, linewidth= 1
)+ geom_point(data=est[est$totmet_hpwk%in%k.pos.totmet_hpwk,] #to plot the places of the knots
, aes(x=totmet_hpwk, y=eta)
)+ geom_rug(data = data_OS, aes(x=totmet_hpwk) #add rugs representing real observations for totmet_hpwk
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .3
)+ scale_x_continuous(expand = c(0, 0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 201) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("total PA (MET-hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MET
Adjusted Cox regression for tertiles linear variable.
cox_tert.adj <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, data = data_OS)
cox_tert.adj_PS <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK_tertiles + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + B_PS_tri, data = data_OS)
tbl_merge(list(
tbl_regression(cox_tert.adj, exponentiate=T),
tbl_regression(cox_tert.adj_PS, exponentiate=T)
),
tab_spanner = c("without ECOG PS", "with ECOG PS")
)
| Characteristic | without ECOG PS | with ECOG PS | ||||
|---|---|---|---|---|---|---|
| HR1 | 95% CI1 | p-value | HR1 | 95% CI1 | p-value | |
| MZvt_HWK_tertiles | ||||||
| [0,1.5] | — | — | — | — | ||
| (1.5,6.5] | 0.70 | 0.41, 1.21 | 0.2 | 0.80 | 0.46, 1.40 | 0.4 |
| (6.5,38.5] | 0.54 | 0.28, 1.04 | 0.066 | 0.66 | 0.33, 1.29 | 0.2 |
| B_sex | ||||||
| male | — | — | — | — | ||
| female | 1.18 | 0.70, 1.97 | 0.5 | 1.26 | 0.74, 2.13 | 0.4 |
| B_age | 1.00 | 0.98, 1.02 | 0.8 | 0.99 | 0.97, 1.01 | 0.2 |
| B_tumor_type.1 | ||||||
| melanoma | — | — | — | — | ||
| NSCLC | 1.06 | 0.44, 2.54 | 0.9 | 1.40 | 0.62, 3.21 | 0.4 |
| RCC | 3.29 | 1.24, 8.74 | 0.017 | 3.36 | 1.36, 8.33 | 0.009 |
| other | 1.25 | 0.56, 2.80 | 0.6 | 1.29 | 0.57, 2.89 | 0.5 |
| B_paladj | ||||||
| palliative | — | — | — | — | ||
| adjuvant | 0.46 | 0.21, 1.00 | 0.050 | 0.68 | 0.31, 1.50 | 0.3 |
| B_ther_prev | ||||||
| no | — | — | — | — | ||
| yes | 2.01 | 1.10, 3.67 | 0.023 | 1.92 | 1.06, 3.51 | 0.033 |
| B_ther_cur_regrouped.1 | ||||||
| anti-PD-(L)1 | — | — | — | — | ||
| cICI | 0.60 | 0.23, 1.57 | 0.3 | 0.81 | 0.33, 2.00 | 0.6 |
| ICI + chemo/targeted therapy | 1.99 | 0.93, 4.25 | 0.074 | 2.38 | 1.15, 4.92 | 0.019 |
| B_PS_tri | ||||||
| 0 | — | — | ||||
| 1 | 1.66 | 0.92, 2.99 | 0.091 | |||
| 2-3 | 6.75 | 2.95, 15.4 | <0.001 | |||
| 1 HR = Hazard Ratio, CI = Confidence Interval | ||||||
Adjusted Cox regression for continuous linear variable.
cox_lin.adj <- coxph(Surv(OS_days, OS_status)~ MZvt_HWK + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, data = data_OS)
# tbl_regression(cox_lin.adj, exponentiate=T)
Shoenfeld residuals plot: it should be possible to draw a horizontal line between the confidence bands. Otherwise, the Cox proportional hazard assumption is violated, meaning that the baseline hazard is not constant over time.
plot(cox.zph(cox_lin.adj, transform = "identity")[1])
abline(h=cox_lin.adj$coefficients[1], col="blue")
Martingale residuals can help decide upon the functional form of a dependent variable: This should give a more or less straight line. If not, consider a transformation of the dependent variable.
fit0 <- update(cox_lin.adj,~.- MZvt_HWK)
resid_X <- resid(fit0, type = "martingale")
if(is.null(fit0$na.action)){
df.tmp <- data.frame(X=data_OS$MZvt_HWK, resid_X=resid_X)
} else {
df.tmp <- data.frame(X=data_OS[-c(fit0$na.action),]$MZvt_HWK, resid_X=resid_X)
}
ggplot(data = df.tmp
)+ geom_point(aes(x=X, y=resid_X), color="black", alpha=0.1
)+ geom_smooth(aes(x=X, y=resid_X), method = loess
)+ labs(x="MZvt_HWK", y="Martingale residuals")
## `geom_smooth()` using formula = 'y ~ x'
fit0 <- update(cox_lin.adj,~.- B_age)
resid_X <- resid(fit0, type = "martingale")
if(is.null(fit0$na.action)){
df.tmp <- data.frame(X=data_OS$B_age, resid_X=resid_X)
} else {
df.tmp <- data.frame(X=data_OS[-c(fit0$na.action),]$B_age, resid_X=resid_X)
}
ggplot(data = df.tmp
)+ geom_point(aes(x=X, y=resid_X), color="black", alpha=0.1
)+ geom_smooth(aes(x=X, y=resid_X), method = loess
)+ labs(x="B_age", y="Martingale residuals")
## `geom_smooth()` using formula = 'y ~ x'
rm(fit0, resid_X, df.tmp)
Conduct survival analysis using splines regression by the
prespecified knots. We make use of the rcs() function in
package rms which provides restricted cubic splines.
cox_rcs.adj <- coxph(Surv(OS_days, OS_status)~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1, data = data_OS)
tbl_regression(cox_rcs.adj, exponentiate=T)
| Characteristic | HR1 | 95% CI1 | p-value |
|---|---|---|---|
| MZvt_HWK | |||
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK | 0.85 | 0.74, 0.96 | 0.012 |
| rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)MZvt_HWK' | 1.30 | 1.03, 1.63 | 0.025 |
| B_sex | |||
| male | — | — | |
| female | 1.17 | 0.70, 1.97 | 0.6 |
| B_age | 1.00 | 0.98, 1.02 | 0.7 |
| B_tumor_type.1 | |||
| melanoma | — | — | |
| NSCLC | 1.05 | 0.44, 2.53 | >0.9 |
| RCC | 3.24 | 1.22, 8.60 | 0.018 |
| other | 1.20 | 0.53, 2.70 | 0.7 |
| B_paladj | |||
| palliative | — | — | |
| adjuvant | 0.44 | 0.20, 0.95 | 0.037 |
| B_ther_prev | |||
| no | — | — | |
| yes | 2.06 | 1.12, 3.77 | 0.019 |
| B_ther_cur_regrouped.1 | |||
| anti-PD-(L)1 | — | — | |
| cICI | 0.57 | 0.22, 1.49 | 0.3 |
| ICI + chemo/targeted therapy | 1.93 | 0.91, 4.09 | 0.088 |
| 1 HR = Hazard Ratio, CI = Confidence Interval | |||
We can check whether using restricted cubic splines improves the model fit (statistically). A p-value <0.05 indicates that the model fit is statistically different.
We can check whether MZvt_HWK improves the model fit at
all (compare a model with splines for MZvt_HWK (Model
1) to a model without MZvt_HWK (Model
2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## loglik Chisq Df Pr(>|Chi|)
## 1 -327.03
## 2 -330.37 6.664 2 0.03572 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Than, we check whether using splines for MZvt_HWK gives
a better fit than adding it as a linear variable (compare a model with
splines for MZvt_HWK (Model 1) to a model with
MZvt_HWK as is (Model 2)).
anova(cox_rcs.adj,update(cox_rcs.adj,.~.-rms::rcs(MZvt_HWK, k.pos.MZvt_HWK)+MZvt_HWK))
## Analysis of Deviance Table
## Cox model: response is Surv(OS_days, OS_status)
## Model 1: ~ rms::rcs(MZvt_HWK, k.pos.MZvt_HWK) + B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1
## Model 2: ~ B_sex + B_age + B_tumor_type.1 + B_paladj + B_ther_prev + B_ther_cur_regrouped.1 + MZvt_HWK
## loglik Chisq Df Pr(>|Chi|)
## 1 -327.03
## 2 -329.43 4.7839 1 0.02873 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We want to visualise the hazard rate (HR) for death (in case of
overall survival (OS)) over the trajectory of our variable of interest
(MZvt_HWK). To do this, we will mak a dummy dataset to
predict the HR for certain values of MZvt_HWK. We will also
obtain the 95% confidence interval (CI).
newdf <- with(data_OS,
expand.grid(MZvt_HWK=c(seq(min(MZvt_HWK),max(MZvt_HWK),length=300), k.pos.MZvt_HWK)
, B_sex = levels(B_sex)[1] #reference level for sex
, B_age = 0 #reference age for age (note, you could also use e.g. median, this wouldn't change the results)
, B_tumor_type.1 = levels(B_tumor_type.1)[1]
, B_paladj = levels(B_paladj)[1]
, B_ther_prev = levels(B_ther_prev)[1]
, B_ther_cur_regrouped.1 = levels(B_ther_cur_regrouped.1)[1]
, B_PS_tri = levels(B_PS_tri)[1]
))
Now we make a model matrix, which is needed in the subsequent steps.
mm <- model.matrix(cox_rcs.adj,data=newdf)
mm[,!colnames(mm) %in% colnames(mm)[grep("MZvt_HWK",colnames(mm))]] <- 0 # set all variables not about MZvt_HWK to 0
For each value of our sequence of MZvt_HWK, we obtain
its predicted coefficient (=eta) and predicted standard
error (=pse). With these, we compute the predicted 95%
conficence intervalls (plo and phi for lower
and upper CI). We store these in a data frame called
est.
est <- data.frame(MZvt_HWK=newdf$MZvt_HWK,eta=mm%*%coef(cox_rcs.adj)) #predicted coefficient
pvar <- diag(mm %*% tcrossprod(vcov(cox_rcs.adj),mm)) #predicted variance
est <- data.frame(est,pse=sqrt(pvar)) #predicted standard error
est <- with(est,
data.frame(est,
plo=eta-qnorm(0.975)*pse, #predicted lower 95%CI bound
phi=eta+qnorm(0.975)*pse)) #predicted upper 95%CI bound
For Cox regression and logistic regression we need to exponentiate the coefficient and confidence bounds to obtain HR and 95% CI.
est[,c("eta","plo","phi")] <- exp(est[,c("eta","plo","phi")])
We remove duplicates (e.g. if we have two of the same values for
MZvt_HWK) and sort on MZvt_HWK.
est <- est[!duplicated(est),] #remove duplicates
est <- est[order(est$MZvt_HWK),] #sort based on var1 order
p_splines_OS_MVPASL <- ggplot(data = est
)+ geom_ribbon(aes(x=MZvt_HWK, ymin = plo, ymax = phi) #color 95%CI
, colour = rgb(248,136,44,maxColorValue = 255,alpha=100)
, fill=rgb(252,211,178,maxColorValue = 255,alpha=100)
, alpha=0.4
)+ geom_hline(yintercept=1 #add horizontal line at 1
, color="dark grey"
, linetype="dashed"
, linewidth=1
)+ geom_line(aes(x=MZvt_HWK,y=eta) #add estimated HR
, color = "coral3"
, linewidth = 1
)+ geom_point(data=est[est$MZvt_HWK%in%k.pos.MZvt_HWK,] #to plot the places of the knots
, aes(x=MZvt_HWK, y=eta)
)+ geom_rug(data = data_OS, aes(x=MZvt_HWK) #add rugs representing real observations for MZvt_HWK
, sides = "b"
, length = grid::unit(0.02, "npc")
, color = "grey"
, alpha = .5
)+ scale_x_continuous(expand = c(0,0) #set axes directly at 0
)+ scale_y_log10(breaks = unique(c(0.1,seq(0,1,.2), seq(0,1,.2)*10))
# )+ annotation_logticks(base = 10, sides = "l", outside=F , scaled = T, color = "black",alpha = .5
)+ coord_cartesian(#xlim = c(0, 30) #set x-axis
ylim = c(0.1,10) #set y-axis
# )+ ggtitle("" #add title
)+ xlab("MVPA-SL (hours/week)" #add x-axis label
)+ ylab("Hazard ratio for death" #add y-axis label
)+ theme_light( #add/alter theme
)+ theme(plot.margin = margin(0.5,3,0.5,0.5, unit = "char")
)
p_splines_OS_MVPASL
Make Supplementary Figure 7.
suppl_figure7 <- ggarrange(p_splines_OS_MET , p_splines_OS_MVPASL
,labels = c("A","B")
,ncol=2, nrow=1
,widths = c(1,1), heights = c(1,1)
,common.legend=F
)
suppl_figure7
Save as PDF.
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.5.2
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Amsterdam
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggsurvfit_0.3.0 survminer_0.4.9 ggpubr_0.6.0 tidycmprsk_0.2.0
## [5] cmprsk_2.2-11 survival_3.5-5 rms_6.7-0 Hmisc_5.1-0
## [9] ggplot2_3.4.2 gtsummary_1.7.1 table1_1.4.3 tidyr_1.3.0
## [13] dplyr_1.1.2
##
## loaded via a namespace (and not attached):
## [1] gridExtra_2.3 sandwich_3.0-2 rlang_1.1.1
## [4] magrittr_2.0.3 multcomp_1.4-25 polspline_1.1.22
## [7] compiler_4.3.1 mgcv_1.8-42 systemfonts_1.0.4
## [10] vctrs_0.6.3 quantreg_5.95 stringr_1.5.0
## [13] pkgconfig_2.0.3 fastmap_1.1.1 backports_1.4.1
## [16] ellipsis_0.3.2 labeling_0.4.2 KMsurv_0.1-5
## [19] utf8_1.2.3 rmarkdown_2.22 markdown_1.7
## [22] haven_2.5.2 ragg_1.2.5 MatrixModels_0.5-1
## [25] purrr_1.0.1 xfun_0.39 cachem_1.0.8
## [28] labelled_2.11.0 jsonlite_1.8.5 highr_0.10
## [31] broom_1.0.5 cluster_2.1.4 R6_2.5.1
## [34] bslib_0.5.0 stringi_1.7.12 car_3.1-2
## [37] rpart_4.1.19 jquerylib_0.1.4 Rcpp_1.0.10
## [40] knitr_1.43 zoo_1.8-12 base64enc_0.1-3
## [43] Matrix_1.5-4.1 splines_4.3.1 nnet_7.3-19
## [46] tidyselect_1.2.0 rstudioapi_0.14 abind_1.4-5
## [49] yaml_2.3.7 ggtext_0.1.2 codetools_0.2-19
## [52] lattice_0.21-8 tibble_3.2.1 withr_2.5.0
## [55] evaluate_0.21 foreign_0.8-84 xml2_1.3.4
## [58] survMisc_0.5.6 pillar_1.9.0 carData_3.0-5
## [61] checkmate_2.2.0 generics_0.1.3 hms_1.1.3
## [64] munsell_0.5.0 commonmark_1.9.0 scales_1.2.1
## [67] xtable_1.8-4 glue_1.6.2 tools_4.3.1
## [70] data.table_1.14.8 SparseM_1.81 ggsignif_0.6.4
## [73] forcats_1.0.0 mvtnorm_1.2-2 cowplot_1.1.1
## [76] grid_4.3.1 colorspace_2.1-0 patchwork_1.1.2
## [79] nlme_3.1-162 htmlTable_2.4.1 Formula_1.2-5
## [82] cli_3.6.1 km.ci_0.5-6 textshaping_0.3.6
## [85] fansi_1.0.4 broom.helpers_1.13.0 gt_0.9.0
## [88] gtable_0.3.3 rstatix_0.7.2 sass_0.4.6
## [91] digest_0.6.31 TH.data_1.1-2 htmlwidgets_1.6.2
## [94] farver_2.1.1 htmltools_0.5.5 lifecycle_1.0.3
## [97] hardhat_1.3.0 gridtext_0.1.5 MASS_7.3-60